(54) Title: NOVEL N-ACETYLGALACTOSAMINE TRANSFERASES AND NUCLEIC ACIDS ENCODING THE SAME



(57) Abstract: An enzyme which transfers N-acetylgalactosamine to N-acetylglucosamine via a β1-4 linkage was isolated and the
structure of its gene was elucidated. This led to the production of said enzyme or the like by genetic engineering techniques, the
production of oligosaccharides using said enzyme, and the diagnosis of diseases on the basis of said gene or the like. The present
invention uses a protein having the amino acid sequence shown in SEQ ID NO: 1, 3, 26 or 27 in the Sequence Listing or a variant of
said amino acid sequence wherein one or more amino acids are substituted or deleted, or one or more amino acids are inserted or added, and having
the activity of transferring N-acetylgalactosamine (GalNAc) to N-acetylglucosamine serving as a substrate via a β1-4 linkage, and
nucleic acids encoding said protein.
DESCRIPTION

Novel N-Acetylgalactosamine Transferases and 
Nucleic Acids Encoding the Same 



Technical Field

The present invention relates to novel enzymes having
the activity of transferring N-acetylgalactosamine to
N-acetylglucosamine via a β1-4 linkage and nucleic acids
encoding the same, as well as to nucleic acids for assaying
said nucleic acids.



Background Art 

In various kinds of organisms, structures having the
disaccharide N-acetylgalactosamine-N-acetylglucosamine
have been found in oligosaccharides of
glycoproteins and glycolipids [see References 1 and 2]. In
humans, this disaccharide structure is known as a β1-4
linkage (GalNAcβ1-4GlcNAc), and is found only in N-glycans
[see Reference 3]. Methods for obtaining human-type
oligosaccharides including said structure are limited to
methods using complicated chemical synthesis and methods
obtaining the oligosaccharides from natural proteins.
Further, in vivo, the above disaccharide structure can contain
galactose in place of N-acetylgalactosamine.
Therefore, it is a lengthy, laborious process to obtain
oligosaccharides having the target disaccharide structure.

Prior to the present application, the inventors
identified ppGalNAc-T10, -T11, -T12, -T13, -T14, -T15, -T16,
-T17, CSGalNAc-T1, and -T2 as enzymes having an activity of
transferring N-acetylgalactosamine to glucuronic acids and
polypeptides, and further, they clarified the structures of
these genes. At least 22 N-acetylgalactosamine transferases
that have the activity of transferring N-acetylgalactosamine
are already known (Table 1), and each of
the transferases has a different acceptor substrate
specificity.
Disclosure of Invention

Isolation of an enzyme having the activity of
transferring N-acetylgalactosamine to N-acetylglucosamine
via a β1-4 linkage and elucidation of the structure of
its gene would enable the production of said enzyme or the like
through genetic engineering techniques, and the diagnosis of
diseases on the basis of said gene or the like. However,
such an enzyme has not yet been isolated or purified, and there
has been no clue to isolating such an enzyme and identifying its
gene. Consequently, no antibody against such an enzyme has
been prepared.

Therefore, the present invention provides a protein
having an activity of transferring N-acetylgalactosamine to
N-acetylglucosamine via a β1-4 linkage and nucleic acids
encoding the same. The present invention also provides a
recombinant vector for expressing said nucleic acids in a
host cell, and a cell into which said vector has been
introduced and which expresses said nucleic acids and said
protein. Further,
said expressed protein can be used for producing an antibody.
Therefore, the present invention also provides a method for
producing said protein. Further, the expressed protein and
said antibody to the protein can be applied to
immunohistochemical staining and to immunoassays such as RIA
and EIA. Moreover, the present invention provides an
analytical nucleic acid for assaying the above nucleic acid
of the present invention.

As described above, the objective enzymes have not
yet been identified, and therefore no partial amino acid
sequence information is available. In general, it is
difficult to isolate and purify proteins which are present
in only a very small quantity in cells. Therefore, it is
not expected to be easy to isolate from cells enzymes which
have not been isolated so far. The inventors therefore
tried to isolate and purify the target enzymes by targeting a
region expected to be highly conserved, namely a region whose
nucleic acid sequence may be homologous between the gene of
the objective enzyme and the genes of various
enzymes having relatively similar activity.
Specifically, the inventors first searched nucleic acid
sequences of publicly known β1,4-galactose transferases, and
identified homologous regions. Second, primers were
designed based on these homologous regions, and a full-length
open reading frame was identified from a cDNA library by
the 5' RACE (rapid amplification of cDNA ends) method. Further,
the inventors succeeded in cloning a gene of said enzyme by
PCR, and completed the present invention by determining
the nucleic acid sequences thereof and the putative amino acid
sequences.

The present invention provides a protein having the
activity of transferring N-acetylgalactosamine and a nucleic
acid encoding the same, and thereby assists in satisfying
these various requirements in the art.

Namely, the present invention provides a mammalian
protein having the activity of transferring
N-acetylgalactosamine to N-acetylglucosamine via a β1-4
linkage.

The human protein of the present invention typically has
the amino acid sequence of SEQ ID NO: 1 or 3, which
is deduced from the nucleic acid sequence of SEQ ID NO: 2 or 4.

The mouse protein of the present invention has the amino
acid sequence of SEQ ID NO: 26 or 28, which is deduced from
the nucleic acid sequence of SEQ ID NO: 27 or 29.

The present invention includes not only the protein
having the amino acid sequence which is selected from the
group consisting of SEQ ID NOs: 1, 3, 26 and 28 but also
proteins having an identity of 50 % or more to said sequence.
The present invention includes proteins having said amino
acid sequence, wherein one or more amino acids are
substituted or deleted, or one or more amino acids are
inserted or added.

The proteins of the present invention have amino acid
sequences which have an identity of 60 % or more, preferably
70 % or more, more preferably 80 % or more, still more
preferably 90 % or more, and most preferably 95 % or more to
the amino acid
sequence which is selected from the group consisting of SEQ ID
NOs: 1, 3, 26 and 28.

The present invention provides nucleic acids encoding
the protein of the present invention.

The nucleic acids of the present invention typically have
the nucleic acid sequence which is selected from
the group consisting of SEQ ID NOs: 2, 4, 27 and 29, a nucleic
acid sequence in which one or more nucleotides are
substituted, deleted, inserted and/or added in the above
nucleic acid sequence, or a nucleic acid sequence which
hybridizes with said nucleic acid sequence under stringent
conditions, and include the nucleic acids
complementary to the above sequences. In one embodiment,
the present invention includes, but is not limited to,
nucleic acids having the nucleic acid sequence represented
by nucleotides 1-3120 of the nucleic acid sequence shown in
SEQ ID NO: 2, nucleotides 1-2997 of the nucleic acid
sequence shown in SEQ ID NO: 4, nucleotides 1-3105 of the
nucleic acid sequence shown in SEQ ID NO: 27, or nucleotides
1-2961 of the nucleic acid sequence shown in SEQ ID NO: 29.

The present invention provides a recombinant vector
containing the nucleic acids of the present invention.

The present invention provides the transformants
obtained by introducing the recombinant vector of the
present invention into host cells.

The present invention provides an analytical nucleic
acid which hybridizes to the nucleic acids encoding the
protein of the present invention under stringent conditions.
The analytical nucleic acid preferably has the sequence
shown in any one of SEQ ID NOs: 20, 21, 23 and 24 when
the analytical nucleic acid of the present
invention is used as a probe for assaying the nucleic acids
encoding said protein. Further, the analytical nucleic acid of the
present invention can be used as a cancer marker.

The present invention provides an assay kit
comprising the analytical nucleic acid which hybridizes to
the nucleic acid of the present invention.

The present invention provides an isolated antibody
binding to the protein of the present invention, or a
monoclonal antibody thereof.

Further, the present invention provides a method for
determining canceration of a biological sample, which
comprises a step of quantifying the protein or the nucleic
acid of the present invention in the biological sample.

Brief Description of Drawings

Fig. 1 is a graph showing the quantitative analysis
of the expression level of the NGalNAc-T1 or NGalNAc-T2 gene in
various human tissues by real-time PCR. The ordinate
represents the ratio of the expression level of the
NGalNAc-T1 or NGalNAc-T2 gene to that of a control
glyceraldehyde-3-phosphate dehydrogenase (GAPDH) gene. The
expression of the NGalNAc-T1 and NGalNAc-T2 genes is
represented by a black bar and a white bar, respectively.

Fig. 2 is a graph showing the quantitative analysis
of the expression level of the NGalNAc-T1 (panel A) or NGalNAc-T2
(panel B) gene in human lung cancerous tissue and normal
tissue by real-time PCR. The ordinate
represents the ratio of the expression level of the NGalNAc-T1
or NGalNAc-T2 gene to that of a control human β-actin
gene. The abscissa shows the number assigned to
each patient. The normal tissue and the cancerous tissue
are represented by a white bar and a black bar, respectively.

Fig. 3 shows the LacdiNAc synthesizing activity of
NGalNAc-T2 toward asialo/agalacto-fetal calf fetuin. The
asialo/agalacto-FCF appears as bands of approximately 55 and
60 kDa (lane 1). NGalNAc-T2 effectively transfers GalNAc
to asialo/agalacto-FCF (lane 5). The band mostly
disappeared after GPF treatment (lane 6).

Fig. 4 shows an analysis of N-glycan structures of
glycodelin from NGalNAc-T1 and NGalNAc-T2 gene transfected
CHO cells. The non-reducing terminal GalNAc is detected
only when the NGalNAc-T1 or NGalNAc-T2 gene is co-transfected
with the glycodelin gene.

Fig. 5 shows a one-dimensional ¹H NMR spectrum of the
structure of GalNAcβ1-4GlcNAc-O-Bz produced by NGalNAc-T2.

Fig. 6 shows a two-dimensional ¹H NMR spectrum of the
structure of GalNAcβ1-4GlcNAc-O-Bz produced by NGalNAc-T2.

Detailed Description of the Invention

In order to explain the present invention,
preferable embodiments for carrying out the invention are
described in detail below.

(1) Proteins

The nucleic acid encoding the human protein of the
present invention cloned by the method described in detail
in the examples below has the nucleotide sequence shown in
SEQ ID NO: 2 or 4 in the Sequence Listing, under which the
deduced amino acid sequence encoded thereby is also shown.
In addition, SEQ ID NO: 1 or 3 shows only said amino acid
sequence.

The proteins (hereinafter designated "NGalNAc-T1"
and "NGalNAc-T2") of the present invention obtained in the
examples below are enzymes having the properties listed
below. Each property of the proteins of the
present invention and the method for determining the
activity thereof are described in detail in the examples
below.

Activity: Transferring N-acetylgalactosamine to
N-acetylglucosamine via a β1-4 linkage. The catalytic
reaction is represented by the reaction formula:

UDP-N-acetyl-D-galactosamine + N-acetyl-D-glucosamine-R
-> UDP + N-acetyl-D-galactosaminyl-N-acetyl-D-glucosamine-R

(UDP-GalNAc + GlcNAc-R -> UDP + GalNAc-GlcNAc-R)

Specific substrate: N-acetylglucosamine, such as
N-acetylglucosamine β1-3-R (R is a residue, such as mannose
or p-nitrophenol, bound through its hydroxyl group via an
ether linkage).

In a preferable embodiment, the proteins of the
present invention have at least one of the following
properties, preferably all of these properties:

(A) Specificity of acceptor substrates

(a) When O-linked oligosaccharides are used as an
acceptor substrate, said proteins have the activity of
transferring N-acetylgalactosamine to
GlcNAcβ1-6(Galβ1-3)GalNAcα-pNp (hereinafter, "core2-pNp"),
GlcNAcβ1-3GalNAcα-pNp (hereinafter, "core3-pNp"), and
GlcNAcβ1-6GalNAcα-pNp
(hereinafter, "core6-pNp") via a β1-4 linkage, wherein the
abbreviations used are: GlcNAc, N-acetylglucosamine; GalNAc,
N-acetylgalactosamine; Gal, galactose; pNp, p-nitrophenyl.
Preferably, said proteins have the transferring activity to
core6-pNp.

(b) When N-linked oligosaccharides are used as an
acceptor substrate, said proteins have the activity of
transferring N-acetylgalactosamine to GlcNAc at the
non-reducing end of said oligosaccharides via a β1-4 linkage,
provided that said activity is reduced when said
oligosaccharides have the following properties:

(i) having fucose (Fuc) residues in the structure of
said oligosaccharides; and

(ii) having one or more branched chains wherein
GalNAc residues bind to GlcNAc residues at the non-reducing
end.

(B) Optimum pH in enzymatic activity

The activity tends to be higher at pH 6.5 in MES
(2-morpholineethanesulfonic acid) buffer. In HEPES
([4-(2-hydroxyethyl)-1-piperazinyl]ethanesulfonic acid) buffer, the
activity tends to be higher at pH 6.75 for NGalNAc-T1 and at pH
7.4 for NGalNAc-T2.

(C) Requirement of divalent ions

In NGalNAc-T1, the activity tends to be higher in the
MES buffer including at least Mn²⁺ or Cu²⁺, preferably Mn²⁺.
In NGalNAc-T2, the activity tends to be higher in the MES
buffer including Mg²⁺, Mn²⁺, or Co²⁺, preferably Mg²⁺.

The nucleic acid encoding the mouse protein of the
present invention also has the nucleotide sequence shown in
SEQ ID NO: 27 or 29 in the Sequence Listing, under which the
deduced amino acid sequence encoded thereby is also shown.
In addition, SEQ ID NO: 26 or 28 shows only said amino acid
sequence. The proteins (hereinafter designated "mNGalNAc-T1"
and "mNGalNAc-T2") of the present invention are enzymes
having the above properties.

The present invention provides a protein having an
activity for transferring N-acetylgalactosamine to
N-acetylglucosamine via a β1-4 linkage. So far as the
proteins of the present invention have the properties
described herein, the origins thereof, the method for
producing them and the like are not limited. Namely, the
proteins of the present invention include, for example,
native proteins, proteins expressed from recombinant DNA
using genetic engineering techniques, and chemically
synthesized proteins.

The protein of the present invention typically has an
amino acid sequence consisting of the 1039 amino acids shown in
SEQ ID NO: 1, the 998 amino acids shown in SEQ ID NO: 3, the 1034
amino acids shown in SEQ ID NO: 26, or the 986 amino acids shown
in SEQ ID NO: 28. However, it is well known that native
proteins include mutant proteins having one or more
amino acid variations, arising from gene mutations
that depend on the species of organism producing the
proteins and on ecotypes, or from the presence of very
similar isozymes or the like. The term "mutant
protein(s)" used herein means proteins and the like having a
variant of said amino acid sequence, wherein one or more
amino acids are substituted or deleted, or one or more amino
acids are inserted or added in the amino acid sequence of
SEQ ID NO: 1, 3, 26 or 28, and having the activity of
transferring N-acetylgalactosamine to N-acetylglucosamine
via a β1-4 linkage. The expression "one or more" here
preferably means 1-300, more preferably 1-100, and most
preferably 1-50. Generally, when amino
acids are substituted by site-specific mutagenesis, the number
of amino acids that can be substituted to the extent that
the activity of the original protein is retained is
preferably 1-10.
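
The stated protein lengths can be cross-checked against the nucleotide ranges given in the Disclosure of Invention (nucleotides 1-3120, 1-2997, 1-3105 and 1-2961 of SEQ ID NOs: 2, 4, 27 and 29). The following Python sketch is illustrative only. It assumes that each range is an open reading frame of three nucleotides per amino acid plus one stop codon; that interpretation and the function names are assumptions made for the example and are not taken from the Sequence Listing.

```python
# Illustrative consistency check. Assumption: each stated nucleotide range is
# an open reading frame that ends in a single stop codon.

# (protein SEQ ID NO, amino acids, nucleic acid SEQ ID NO, nucleotides),
# with all four numbers taken from the description.
STATED_LENGTHS = [
    (1, 1039, 2, 3120),
    (3, 998, 4, 2997),
    (26, 1034, 27, 3105),
    (28, 986, 29, 2961),
]


def coding_length(n_residues: int, stop_codons: int = 1) -> int:
    """Return the nucleotides needed to encode n_residues plus stop codons."""
    return 3 * (n_residues + stop_codons)


def check_lengths() -> dict:
    """Map each protein SEQ ID NO to True if the stated lengths agree."""
    return {
        protein_id: coding_length(n_aa) == n_nt
        for protein_id, n_aa, _nucleic_id, n_nt in STATED_LENGTHS
    }


if __name__ == "__main__":
    for protein_id, consistent in check_lengths().items():
        print(f"SEQ ID NO: {protein_id}: consistent = {consistent}")
```

Under this assumption, all four protein lengths agree with their nucleotide ranges.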

Proteins of the present invention have the amino acid
sequences shown in SEQ ID NO: 1 or 3 (and in the lower rows of
SEQ ID NO: 2 or 4), or the amino acid sequences shown in SEQ ID
NO: 26 or 28 (and in the lower rows of SEQ ID NO: 27 or 29), as
deduced from the nucleotide
sequences of the cloned nucleic acids, but are not
exclusively limited to the proteins having these sequences,
and are intended to include all homologous proteins having
the characteristics described herein. The identity is at
least 50 %, preferably 60 % or more, more preferably 70 % or
more, even more preferably 80 % or more, still more
preferably 90 % or more, and most preferably 95 % or more.

As used herein, the percentage identity of amino acid
sequences can be determined by comparison with sequence
information using, for example, the BLAST program described
by Altschul et al. (Nucl. Acids Res., 25, pp. 3389-3402,
1997) or the FASTA program described by Pearson et al. (Proc.
Natl. Acad. Sci. USA, pp. 2444-2448, 1988). These programs
are available from the website of the National Center for
Biotechnology Information (NCBI) or the DNA Data Bank of Japan
(DDBJ) on the Internet. Various conditions (parameters) for
homology searches with each program are described in detail
on the site, and searches are normally performed with
default values, though some settings may be appropriately
changed. Other programs used by those skilled in the art of
sequence comparison may also be used.
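
As an illustration of what a percentage identity expresses, the following Python sketch computes it from two sequences that have already been aligned, for example by one of the programs named above. It is a minimal sketch and not the BLAST or FASTA algorithm. The function name, the gap character, and the choice of denominator (all columns except those where both sequences have a gap) are assumptions made for the example; alignment programs differ in how they define the denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap are skipped. A column counts as
    a match only when both residues are identical and neither is a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue  # uninformative column
        compared += 1
        if res_a == res_b:
            matches += 1  # neither is a gap here, since both-gap was skipped

    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy peptides, not sequences from the Sequence Listing.
    print(percent_identity("MKT-LLV", "MKTALIV"))
```

A value computed this way can then be compared with thresholds such as the 50 % to 95 % levels recited above.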

Generally, a modified protein containing a change
from one amino acid to another amino acid having similar
properties (such as a change from a hydrophobic amino acid
to another hydrophobic amino acid, a change from a
hydrophilic amino acid to another hydrophilic amino acid, a
change from an acidic amino acid to another acidic amino
acid or a change from a basic amino acid to another basic
amino acid) often has similar properties to those of the
original protein. Methods for preparing such a recombinant
protein having a desired variation using genetic engineering
techniques are well known to those skilled in the art, and
such modified proteins are also included in the scope of the
present invention.
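
The notion of a change to an amino acid with similar properties can be sketched in Python as follows. The grouping of the twenty standard amino acids into four classes is one common textbook convention chosen for illustration; it is an assumption of this example and is not defined in the present description. Acidic and basic residues are kept separate from the other hydrophilic residues so that each residue belongs to exactly one class.

```python
# One possible property grouping (an assumption for illustration only).
PROPERTY_CLASSES = {
    "hydrophobic": set("AVLIMFWP"),
    "hydrophilic": set("GSTCYNQ"),  # polar, uncharged
    "acidic": set("DE"),
    "basic": set("KRH"),
}


def property_class(residue: str) -> str:
    """Return the class name of a one-letter amino acid code."""
    for name, members in PROPERTY_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown amino acid code: {residue!r}")


def is_conservative(original: str, substituted: str) -> bool:
    """True if both residues fall in the same property class."""
    return property_class(original) == property_class(substituted)


if __name__ == "__main__":
    for pair in [("L", "I"), ("D", "E"), ("K", "R"), ("D", "K")]:
        print(pair, is_conservative(*pair))
```

Whether a given substitution actually preserves the transferase activity must still be determined experimentally; the classification only indicates which changes are more likely to be tolerated.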
Proteins of the present invention can be obtained in
bulk by, for example, introducing and expressing the DNA
sequence of SEQ ID NO: 2, 4, 27 or 29 representing a nucleic
acid of the present invention in E. coli, yeast, insect or
animal cells using an expression vector capable of being
amplified in each host, as described in the examples below.
When the identity search of the protein of the
present invention is performed using GENETYX (Genetyx Co.),
the NGalNAc-T1 has 47.2 % identity to NGalNAc-T2, 84.3 %
identity to mNGalNAc-T1, and 47.4 % identity to mNGalNAc-T2.
The NGalNAc-T2 has 46.5 % identity to mNGalNAc-T1, and
82.6 % identity to mNGalNAc-T2. The mNGalNAc-T1 has 46.3 %
identity to mNGalNAc-T2.
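
The pairwise values above can be collected in a small lookup table to make the pattern easier to see. The following Python sketch uses only the six amino acid identity values stated in the preceding paragraph; the data structure and function name are hypothetical and serve only as an illustration.

```python
# Pairwise amino acid identities (%) as stated in the description.
IDENTITY = {
    frozenset({"NGalNAc-T1", "NGalNAc-T2"}): 47.2,
    frozenset({"NGalNAc-T1", "mNGalNAc-T1"}): 84.3,
    frozenset({"NGalNAc-T1", "mNGalNAc-T2"}): 47.4,
    frozenset({"NGalNAc-T2", "mNGalNAc-T1"}): 46.5,
    frozenset({"NGalNAc-T2", "mNGalNAc-T2"}): 82.6,
    frozenset({"mNGalNAc-T1", "mNGalNAc-T2"}): 46.3,
}


def best_match(name: str) -> tuple:
    """Return (partner, identity) for the most similar other protein."""
    candidates = {
        next(iter(pair - {name})): pct
        for pair, pct in IDENTITY.items()
        if name in pair
    }
    if not candidates:
        raise KeyError(name)
    partner = max(candidates, key=candidates.get)
    return partner, candidates[partner]


if __name__ == "__main__":
    for protein in ["NGalNAc-T1", "NGalNAc-T2", "mNGalNAc-T1", "mNGalNAc-T2"]:
        print(protein, "->", best_match(protein))
```

With these values, each human protein is most similar to the mouse protein of the same number (above 80 %), whereas T1 and T2 share less than 50 % identity with each other within and across species.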

The NGalNAc-T1 has 26.1 % identity in 226 amino acids
of the C terminus to CSGalNAc-T1, while the NGalNAc-T2 has
21.6 % identity in 431 amino acids of the C terminus to
CSGalNAc-T1 and 25.0 % identity in 224 amino acids of the C
terminus to CSGalNAc-T2.

Further, the NGalNAc-T1 has 19.3 % identity to human
chondroitin synthase 1 (hCSS1) and 18.0 % identity to mouse
chondroitin synthase 1 (mCSS1), while the NGalNAc-T2 has
18.2 % identity to hCSS1 and 18.1 % identity to mCSS1.

The mNGalNAc-T1 has 18.5 % identity to hCSS1 and
18.1 % identity to mCSS1, while the mNGalNAc-T2 has 18.1 %
identity to hCSS1 and 18.8 % identity to mCSS1.

Therefore, it is recognized that the protein of the
present invention is a novel one.

In addition, the protein of the present invention has
an identity of 27 % or more to the amino acid sequence of
SEQ ID NO: 1 or 3.

The protein of the present invention has an identity
of 19 % or more to the amino acid sequence of SEQ ID NO: 26
or 28.

GENETYX is genetic information
processing software for nucleic acid analysis and protein
analysis, which is capable of performing general homology
analysis and multiple alignment analysis, as well as
predicting signal peptides, promoter sites, and
secondary structure. The program for homology analysis used
herein adopts the Lipman-Pearson method (Lipman, D. J. &
Pearson, W. R., Science, 227, 1435-1441 (1985)), which is
frequently used as a high-speed, highly sensitive method.

The amino acid sequences of the proteins and the DNA
sequences encoding them disclosed herein can be wholly or
partially used to readily isolate genes encoding proteins
having a similar physiological activity from other
species using genetic engineering techniques including
hybridization and nucleic acid amplification reactions such
as PCR. In such cases, novel proteins encoded by these
genes can also be included in the scope of the present
invention.

Proteins of the present invention may contain an attached
sugar chain so long as they have an amino acid sequence as defined
above as well as the enzymatic activity described above.

More specifically, as described in Examples 2 and 5 below,
a search for acceptor substrates of the protein of
the present invention showed that said protein acts to transfer
GalNAc to GlcNAc via a β1-4 linkage.

More specifically still, the proteins of the present
invention have at least one of the following properties (A)-(C),
preferably all of these properties:
(A) Specificity of acceptor substrates

(a) When O-linked oligosaccharides are used as an
acceptor substrate, said proteins have the activity of
transferring N-acetylgalactosamine to
GlcNAcβ1-6(Galβ1-3)GalNAcα-pNp (hereinafter, "core2-pNp"),
GlcNAcβ1-3GalNAcα-pNp (hereinafter, "core3-pNp"), and
GlcNAcβ1-6GalNAcα-pNp
(hereinafter, "core6-pNp") via a β1-4 linkage, wherein the
abbreviations used are: GlcNAc, N-acetylglucosamine; GalNAc,
N-acetylgalactosamine; Gal, galactose; pNp, p-nitrophenyl.
Preferably, said proteins have the transferring activity to
core6-pNp.

(b) When N-linked oligosaccharides are used as an
acceptor substrate, said proteins have the activity of
transferring N-acetylgalactosamine to GlcNAc at the
non-reducing end of said oligosaccharides via a β1-4 linkage,
provided that said activity is reduced when said
oligosaccharides have the following properties:

(i) having fucose (Fuc) residues in the structure of
said oligosaccharides; and

(ii) having one or more branched chains wherein GalNAc
residues bind to GlcNAc residues at the non-reducing end.

(B) Optimum pH in enzymatic activity

The activity tends to be higher at pH 6.5 in MES
(2-morpholineethanesulfonic acid) buffer. In HEPES
([4-(2-hydroxyethyl)-1-piperazinyl]ethanesulfonic acid) buffer, the
activity tends to be higher at pH 6.75 for NGalNAc-T1 and at pH
7.4 for NGalNAc-T2.

(C) Requirement of divalent ions

In NGalNAc-T1, the activity tends to be higher in the MES
buffer including at least Mn²⁺ or Co²⁺, preferably Mn²⁺. In
NGalNAc-T2, the activity tends to be higher in the MES
buffer including Mg²⁺, Mn²⁺, or Co²⁺, preferably Mg²⁺.

(2) Nucleic acids

Nucleic acids of the present invention include DNA in
both single-stranded and double-stranded forms, as well as
the RNA complements thereof. DNA includes, for example,
native DNA, recombinant DNA, chemically synthesized DNA, DNA
amplified by PCR and combinations thereof. The nucleic acid
of the present invention is preferably a DNA.

The nucleic acids of the present invention are
nucleic acids (including the complements thereof) encoding
the amino acid sequences shown in SEQ ID NO: 1, 3, 26 or 28.
Typically, the nucleic acids of the present invention have
the nucleic acid sequence of SEQ ID NO: 2, 4, 27 or 29
(including the complements thereof), which are clones
obtained in the working examples below and are merely
examples of the present invention. It is well known to a
person skilled in the art that native nucleic acids include
minor mutants arising from differences between the
species and ecotypes which produce them, and mutants arising
from the presence of isozymes. Therefore, the nucleic acids of the
present invention include, but are not limited to, the
nucleic acids having the nucleic acid sequence shown in SEQ
ID NO: 2, 4, 27 or 29. The nucleic acids of the present
invention include all nucleic acids encoding the proteins of
the present invention.

Particularly, the amino acid sequences of the
proteins and the DNA sequences encoding them disclosed
herein can be wholly or partially used to readily isolate
nucleic acids encoding proteins having a similar
physiological activity from other species using
genetic engineering techniques including hybridization and
nucleic acid amplification reactions such as PCR. In such
cases, such nucleic acids can also be included in the scope
of the present invention.

As used herein, "stringent conditions" means
hybridization under conditions of moderate or high
stringency. Specifically, conditions of moderate stringency
can be readily determined by those having ordinary skill in
the art based on, for example, the length of the DNA. The
basic conditions are shown by Sambrook et al., Molecular
Cloning: A Laboratory Manual, 3rd edition, Vol. 1, 7.42-7.45,
Cold Spring Harbor Laboratory Press, 2001, and include use of
a prewashing solution for the nitrocellulose filters of 5 x
SSC, 0.5 % SDS, 1.0 mM EDTA (pH 8.0), hybridization
conditions of about 50 % formamide, 2 x SSC to 6 x SSC at
about 40-50 °C (or other similar hybridization solution
such as Stark's solution, in about 50 % formamide at about
42 °C), and washing conditions of 0.5 x SSC, 0.1 % SDS at
about 60 °C. Conditions of high stringency can also be
readily determined by those skilled in the art based on, for
example, the length of the DNA. Generally, such conditions
include hybridization and/or washing at a higher temperature
and/or a lower salt concentration as compared with
conditions of moderate stringency, and are defined as
hybridization conditions as above followed by washing in 0.2
x SSC, 0.1 % SDS at about 68 °C. Those skilled in the art
will recognize that the temperature and the salt
concentration of the washing solution can be adjusted as
necessary according to factors such as the length of the
probe.
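
The relationship between the two washing conditions can be made concrete with a short calculation. The following Python sketch records the moderate and high stringency washes described above and estimates their sodium ion concentration. It assumes the standard composition of 1 x SSC (150 mM NaCl and 15 mM trisodium citrate), which is not stated in this description, and it ignores the small sodium contribution of SDS. The class and function names are hypothetical.

```python
from dataclasses import dataclass

# Assumed standard composition of 1 x SSC (not stated in the description).
NACL_MM_PER_1X = 150.0
CITRATE_MM_PER_1X = 15.0
SODIUM_PER_CITRATE = 3  # trisodium citrate carries three Na+ per molecule


@dataclass(frozen=True)
class Wash:
    label: str
    ssc_fold: float     # e.g. 0.5 for 0.5 x SSC
    sds_percent: float
    temp_c: float


def sodium_mm(wash: Wash) -> float:
    """Approximate Na+ concentration (mM) from SSC alone; SDS is ignored."""
    per_1x = NACL_MM_PER_1X + SODIUM_PER_CITRATE * CITRATE_MM_PER_1X
    return wash.ssc_fold * per_1x


def is_more_stringent(a: Wash, b: Wash) -> bool:
    """True if wash a is at least as hot and as low in salt as b, and differs."""
    same = (a.temp_c, a.ssc_fold) == (b.temp_c, b.ssc_fold)
    return a.temp_c >= b.temp_c and sodium_mm(a) <= sodium_mm(b) and not same


# Washing conditions as given in the description.
MODERATE = Wash("moderate stringency", ssc_fold=0.5, sds_percent=0.1, temp_c=60.0)
HIGH = Wash("high stringency", ssc_fold=0.2, sds_percent=0.1, temp_c=68.0)

if __name__ == "__main__":
    for wash in (MODERATE, HIGH):
        print(f"{wash.label}: about {sodium_mm(wash):.1f} mM Na+ at {wash.temp_c} °C")
    print("high is more stringent:", is_more_stringent(HIGH, MODERATE))
```

The high stringency wash combines a higher temperature with a lower salt concentration, which matches the general definition given in the paragraph above.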

Nucleic acid amplification reactions include
reactions involving temperature cycles such as polymerase
chain reaction (PCR) [Saiki R.K. et al., Science, 230,
1350-1354 (1985)], ligase chain reaction (LCR) [Wu D.Y. et al.,
Genomics, 4, 560-569 (1989); Barringer K.J. et al., Gene, 89,
117-122 (1990); Barany F., Proc. Natl. Acad. Sci. USA, 88,
189-193 (1991)] and transcription-based amplification [Kwoh
D.Y. et al., Proc. Natl. Acad. Sci. USA, 86, 1173-1177
(1989)] as well as isothermal reactions such as strand
displacement amplification (SDA) [Walker G.T. et al., Proc.
Natl. Acad. Sci. USA, 89, 392-396 (1992); Walker G.T. et al.,
Nuc. Acids Res., 20, 1691-1696 (1992)], self-sustained
sequence replication (3SR) [Guatelli J.C., Proc. Natl. Acad.
Sci. USA, 87, 1874-1878 (1990)], and the Qβ replicase system
[Lizardi et al., BioTechnology, 6, 1197-1202 (1988)]. Other
reactions such as nucleic acid sequence-based amplification
(NASBA) using competitive amplification of a target nucleic
acid and a variant sequence disclosed in European Patent No.
0525882 can also be used. PCR is preferred.

Homologous nucleic acids cloned by hybridization,
nucleic acid amplification reactions or the like as
described above have an identity of at least 50 %,
preferably 60 % or more, more preferably 70 % or more, even
more preferably 80 % or more, still more preferably 90 % or
more, and most preferably 95 % or more to the nucleotide
sequence of SEQ ID NO: 2, 4, 27 or 29 in the Sequence
Listing.

The percentage identity of nucleic acid sequences may
be determined by visual inspection and mathematical
calculation. Alternatively, the percentage identity of two
nucleic acid sequences can be determined by comparing
sequence information using the GAP computer program, version
6.0, described by Devereux et al., Nucl. Acids Res., 12:387
(1984), which is available from the University of Wisconsin
Genetics Computer Group (UWGCG). The preferred default
parameters for the GAP program include: (1) a unary
comparison matrix (containing a value of 1 for identities
and 0 for non-identities) for nucleotides, and the weighted
comparison matrix of Gribskov and Burgess, Nucl. Acids Res.,
14:6745 (1986), as described by Schwartz and Dayhoff, eds.,
Atlas of Protein Sequence and Structure, National Biomedical
Research Foundation, pp. 353-358 (1979); (2) a penalty of
3.0 for each gap and an additional 0.10 penalty for each
symbol in each gap; and (3) no penalty for end gaps. Other
programs used by one skilled in the art of sequence
comparison may also be used.
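
To show how the default parameters listed above combine, the following Python sketch scores an already aligned pair of nucleotide sequences with a unary matrix (1 for identities, 0 for non-identities), a penalty of 3.0 per gap plus 0.10 per gap symbol, and no penalty for end gaps. It is not the GAP program itself, which also searches for the optimal alignment; the function name and the treatment of a gap run as consecutive gap symbols in the same sequence are assumptions made for the example.

```python
def gap_style_score(aligned_a: str, aligned_b: str,
                    gap_penalty: float = 3.0,
                    gap_symbol_penalty: float = 0.10,
                    gap: str = "-") -> float:
    """Score a fixed nucleotide alignment with GAP-like default parameters."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    columns = list(zip(aligned_a, aligned_b))
    # End gaps are not penalized: trim gap-containing columns at both ends.
    while columns and gap in columns[0]:
        columns.pop(0)
    while columns and gap in columns[-1]:
        columns.pop()

    score = 0.0
    current_gap_seq = None  # 0 or 1 while inside a gap run, else None
    for base_a, base_b in columns:
        if base_a == gap or base_b == gap:
            gap_seq = 0 if base_a == gap else 1
            if gap_seq != current_gap_seq:
                score -= gap_penalty      # a new gap starts
            score -= gap_symbol_penalty   # each symbol inside the gap
            current_gap_seq = gap_seq
        else:
            current_gap_seq = None
            if base_a == base_b:
                score += 1.0              # unary matrix: identity scores 1
    return score


if __name__ == "__main__":
    # Toy sequences, not sequences from the Sequence Listing.
    print(gap_style_score("ACGT-ACGT", "ACGTTACGT"))
```

In the toy example, eight identities and one internal gap of one symbol give a score of 8 - 3.0 - 0.10.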

When the identity search of the nucleic acid of the
present invention is performed using GENETYX (Genetyx Co.),
the NGalNAc-T1 has 59.7 % identity to NGalNAc-T2, 81.4 %
identity to mNGalNAc-T1, and 59.0 % identity to mNGalNAc-T2.
The NGalNAc-T2 has 59.7 % identity to mNGalNAc-T1, and
83.4 % identity to mNGalNAc-T2. The mNGalNAc-T1 has 59.6 %
identity to mNGalNAc-T2.

The NGalNAc-T1 has 44.6 % identity to hCSS1 and
46.0 % identity to mCSS1, while the NGalNAc-T2 has 47.3 %
identity to hCSS1 and 47.9 % identity to mCSS1.

The mNGalNAc-T1 has 46.4 % identity to hCSS1 and
46.6 % identity to mCSS1, while the mNGalNAc-T2 has 48.6 %
identity to hCSS1 and 48.7 % identity to mCSS1.

Therefore, it is recognized that the nucleic acid of
the present invention is a novel one.

In addition, the nucleic acid of the present
invention has an identity of 48 % or more to the nucleic acid
sequence of SEQ ID NO: 2 or 4.

The nucleic acid of the present invention has an
identity of 49 % or more to the nucleic acid sequence of SEQ
ID NO: 27 or 29.

(3) Recombinant vectors and transformants

The present invention provides the recombinant
vectors containing the nucleic acid of the present invention.
Methods for integrating a DNA fragment of a nucleic acid of
the present invention into a vector such as a plasmid are
described in, for example, Sambrook, J. et al., Molecular
Cloning, A Laboratory Manual (3rd edition), Cold Spring
Harbor Laboratory, 1.1 (2001). Commercially available
ligation kits (e.g., those available from Takara Shuzo Co.,
Ltd.) can be conveniently used. The recombinant
vectors thus obtained (e.g., recombinant plasmids) are introduced into
host cells (e.g., E. coli TB1, LE392, or XL-1Blue, etc.).

Suitable methods for introducing a plasmid into a
host cell include the use of calcium chloride or calcium
chloride/rubidium chloride or calcium phosphate,
electroporation, electroinjection, chemical treatment with
PEG or the like, and the use of a gene gun as described in
Sambrook, J. et al., Molecular Cloning, A Laboratory Manual
(3rd edition), Cold Spring Harbor Laboratory, 16.1 (2001).

Vectors can be conveniently prepared by linking a
desired gene by a standard method to a recombination vector
available in the art (e.g., plasmid DNA). Specific examples
of suitable vectors include, but are not limited to, E.
coli-derived plasmids such as pBluescript, pUC18, pUC19 and
pBR322.

Expression vectors are especially useful for producing
desired proteins. The types of expression vectors are not
specifically limited as long as they have the ability to
express a desired gene in various prokaryotic and/or
eukaryotic host cells to produce a desired protein, but
preferably include expression vectors for E. coli such
as pQE-30, pQE-60, pMAL-c2, pMAL-p2, pSE420; expression
vectors for yeasts such as pYES2 (genus Saccharomyces),
pPIC3.5K, pPIC9K, pAO815 (all for genus Pichia); and
expression vectors for insects such as pBacPAK8/9, pBK283,
pVL1392, pBlueBac4.5.

A transformant can be produced by introducing a
desired expression vector into a host cell. The host cells
employed are not specifically limited as long as they are
compatible with the expression vector of the present
invention and can be transformed; various kinds of cells,
such as native cells usually used in the art or artificially
established recombinant cells, can be used.
For example, bacteria (genus Escherichia, genus Bacillus),
yeasts (genus Saccharomyces, genus Pichia, etc.), mammalian
cells, insect cells, and plant cells are exemplified.

The host cells are preferably E. coli, yeasts and
insect cells, which are exemplified by E. coli (M15, JM109,
BL21, etc.), yeasts (INVSc1 (genus Saccharomyces), GS115,
KM71 (genus Pichia), etc.), and insect cells (BmN4, silkworm
larvae, etc.). Examples of animal cells are mouse, Xenopus,
rat, hamster, monkey or human derived cells or cultured cell
lines established from these cells. More specifically, the
host cell is preferably a COS cell, which is a cell line
derived from monkey kidney.

When a bacterium, especially E. coli, is used as a
host cell, the expression vector typically consists of at
least a promoter/operator region, a start codon, a gene
encoding a desired protein, a stop codon, a terminator and a
replicable unit.

When a yeast, plant cell, animal cell or insect cell
is used as a host cell, the expression vector
preferably contains at least a promoter, a start codon, a
gene encoding a desired protein, a stop codon and a
terminator. It may also contain a DNA encoding a signal
peptide, an enhancer sequence, untranslated regions at the
5' and 3' ends of a desired gene, a selectable marker region
or a replicable unit, etc., if desired.

Preferred start codons in vectors of the present
invention include a methionine codon (ATG). Stop codons
include commonly used stop codons (e.g., TAG, TGA, TAA).
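
As a minimal sketch only, a coding sequence intended for
insertion into such a vector could be checked for an ATG start
codon and one of the stop codons listed above. The function
name and the requirement that no stop codon occurs in frame
before the end are assumptions of this sketch, not features of
the invention.

```python
STOP_CODONS = {"TAG", "TGA", "TAA"}


def has_start_and_stop(cds: str) -> bool:
    """Return True if the sequence begins with ATG, ends with a stop
    codon, has a length divisible by three, and contains no earlier
    in-frame stop codon."""
    seq = cds.upper().replace(" ", "")
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # Any stop codon before the last position would truncate the protein.
    return not any(c in STOP_CODONS for c in codons[1:-1])
```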

The replicable unit means DNA capable of replicating
the entire DNA sequence in a host cell, such as natural
plasmids, artificially modified plasmids (plasmids prepared
from natural plasmids), synthetic plasmids, etc. Preferred
plasmids include plasmid pQE30, pET or pCAL or their
artificial variants (DNA fragments obtained by treating
pQE30, pET or pCAL with suitable restriction endonucleases)
for E. coli; plasmid pYES2 or pPIC9K for yeasts; and plasmid
pBacPAK8/9 for insect cells.

Enhancer sequences and terminator sequences may be 
those commonly used by those skilled in the art such as 
those derived from SV40. 

As for selectable markers, those commonly used can be
used by standard methods. Examples are genes conferring
resistance to antibiotics such as tetracycline, ampicillin,
kanamycin, neomycin, hygromycin or spectinomycin.

Expression vectors can be prepared by linking at
least a promoter, a start codon, a gene encoding a desired
protein, a stop codon and a terminator region as described
above to a suitable replicable unit in series to form a circle.
While carrying out the linking process, a suitable DNA
fragment (such as a linker or another restriction site) can
be used by standard methods such as digestion with a
restriction endonuclease or ligation with T4 DNA ligase, if
desired.

Introduction [transformation (transduction)] of
expression vectors of the present invention into host cells
can be performed by using known techniques.

For example, bacteria (such as E. coli, Bacillus
subtilis) can be transformed by the method of Cohen et al.
[Proc. Natl. Acad. Sci. USA, 69, 2110 (1972)], the
protoplast method [Mol. Gen. Genet., 168, 111 (1979)] or the
competent method [J. Mol. Biol., 56, 209 (1971)];
Saccharomyces cerevisiae can be transformed by the method of
Hinnen et al. [Proc. Natl. Acad. Sci. USA, 75, 1927 (1978)]
or the lithium method [J. Bacteriol., 153, 163 (1983)];
plant cells can be transformed by the leaf disc method
[Science, 227, 129 (1985)] or electroporation [Nature, 319,
791 (1986)]; animal cells can be transformed by the method
of Graham [Virology, 52, 456 (1973)]; and insect cells can
be transformed by the method of Summers et al. [Mol. Cell.
Biol., 3, 2156-2165 (1983)].

(4) Isolation/purification of proteins

Proteins of the present invention can be expressed
(produced) by culturing transformed cells containing an
expression vector prepared as described above in a nutrient
medium. The nutrient medium preferably contains a carbon,
inorganic nitrogen or organic nitrogen source necessary for
the growth of host cells (transformants). Examples of
carbon sources include glucose, dextran, soluble starch,
sucrose and methanol. Examples of inorganic or organic
nitrogen sources include ammonium salts, nitrates, amino
acids, corn steep liquor, peptone, casein, beef extract,
soybean meal and potato extract. If desired, other
nutrients (e.g., inorganic salts such as sodium chloride,
calcium chloride, sodium dihydrogen phosphate and magnesium
chloride; vitamins; antibiotics such as tetracycline,
neomycin, ampicillin and kanamycin) may be contained.
Incubation of cultures takes place by techniques known in
the art. Culture conditions such as temperature, the pH of
the medium and the incubation period are appropriately
selected to produce a protein of the present invention in
large quantities.

Proteins of the present invention can be obtained
from the resulting cultures as follows. That is, when
proteins of the present invention accumulate in host cells,
the host cells are collected by centrifugation or filtration
or the like and suspended in a suitable buffer (e.g., a
buffer such as a Tris buffer, a phosphate buffer, an HEPES
buffer or an MES buffer at a concentration of about 10 mM -
100 mM, desirably at a pH in the range of 5.0 - 9.0, though
the pH depends on the buffer used), then the cells are
disrupted by a method suitable for the host cells used and
centrifuged to collect the contents of the host cells. When
proteins of the present invention are secreted from host
cells, the host cells and culture medium are separated by
centrifugation or filtration or the like to give a culture
filtrate. The disruption solution of the host cells or the
culture filtrate can be used to isolate/purify a protein of
the present invention directly or after ammonium sulfate
precipitation and dialysis. An isolation/purification method
is as follows. When the protein of interest is tagged with
6 x histidine, GST, maltose-binding protein or the like,
conventional methods based on affinity chromatography
suitable for each tag can be used. When the protein of the
present invention is produced without using these tags, the
method described in detail in the examples below based on
ion exchange chromatography can be used, for example. These
methods may be combined with gel filtration chromatography,
hydrophobic chromatography, isoelectric chromatography or
the like.

N-acetylgalactosamine is transferred by the action of
proteins of the present invention on a glycoprotein,
oligosaccharide, polysaccharide or the like having
N-acetylglucosamine. Thus, proteins of the present invention
can be used to modify a sugar chain of a glycoprotein or to
synthesize a sugar. Moreover, the proteins can be
administered as immunogens to an animal to prepare
antibodies against said proteins, and said antibodies can be
used to determine said proteins by immunoassays. Thus,
proteins of the present invention and the nucleic acids
encoding them are useful in the preparation of such
immunogens.

Further, proteins of the present invention can
comprise peptides added to facilitate purification and
identification. Such peptides include, for example, poly-His
or the antigenic identification peptides described in US
Patent No. 5,011,912 and in Hopp et al., Bio/Technology,
6:1204, 1988. One such peptide is the FLAG® peptide,
Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 30), which is highly
antigenic and provides an epitope reversibly bound by a
specific monoclonal antibody, enabling rapid assay and
facile purification of expressed recombinant protein. A
murine hybridoma designated 4E11 produces a monoclonal
antibody that binds the FLAG® peptide in the presence of
certain divalent metal cations, as described in US Patent No.
5,011,912, hereby incorporated by reference. The 4E11
hybridoma cell line has been deposited with the American
Type Culture Collection under Accession No. HB 9259.
Monoclonal antibodies that bind the FLAG® peptide are
available from Eastman Kodak Co., Scientific Imaging Systems
Division, New Haven, Connecticut.

Specifically, the cDNA of the FLAG is inserted into
an expression vector expressing a protein of the present
invention to express the FLAG-tagged protein, after which
the expression of the protein of the present invention can
be confirmed by an anti-FLAG antibody.

(5) Analytical nucleic acid

According to the present invention, a nucleic acid
which hybridizes to the nucleic acids of the present
invention (hereinafter referred to as "analytical nucleic
acid") is provided. The analytical nucleic acid of the
present invention includes, but is not limited to, typically,
native or synthesized fragments derived from nucleic acid
encoding the protein of the present invention. As used
herein, the term "analytical" includes any of detection,
amplification, quantitative and semi-quantitative assays.

(a) Primers

When analytical nucleic acids of the present
invention are used as primers for nucleic acid amplification
reactions, the analytical nucleic acids of the present
invention are oligonucleotides prepared by a process
comprising:

selecting two regions from the nucleotide sequence of
a gene encoding a protein of SEQ ID NO: 1, 3, 26 or 28 to
satisfy the conditions that:

1) each region should have a length of 15-50 bases;
and
2) the proportion of G + C in each region should be
40-70 %;

generating a single-stranded DNA having a nucleotide
sequence identical to or complementary to that of said
region or generating a mixture of single-stranded DNAs
taking into account degeneracy of the genetic code so that
the amino acid residue encoded by said single-stranded DNA
is retained, and, as necessary, generating the
single-stranded DNA containing a modification without affecting the
binding specificity to the nucleotide sequence of the gene
encoding said protein.



Primers of the present invention preferably have a
sequence homologous to that of a partial region of a nucleic
acid of the present invention, but one to two bases may be
mismatched.

Primers of the present invention contain 15 bases or
more, preferably 18 bases or more, more preferably 21 bases
or more, and 50 bases or fewer.
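
The two selection conditions stated above for a primer region
(a length of 15-50 bases and a G + C proportion of 40-70 %)
can be expressed, purely as an illustrative sketch, by the
following Python functions. The function names are
hypothetical. The usage line applies the check to the K12R6
primer sequence quoted in Example 1, which serves here only as
sample input.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    s = seq.upper().replace(" ", "")
    if not s:
        raise ValueError("empty sequence")
    return 100.0 * sum(1 for base in s if base in "GC") / len(s)


def is_candidate_primer_region(seq: str, min_len: int = 15,
                               max_len: int = 50,
                               min_gc: float = 40.0,
                               max_gc: float = 70.0) -> bool:
    """Check condition 1) length and condition 2) G + C proportion."""
    s = seq.upper().replace(" ", "")
    if not (min_len <= len(s) <= max_len):
        return False
    return min_gc <= gc_percent(s) <= max_gc


# Sample input: primer K12R6 (21 bases, 14 of them G or C).
print(is_candidate_primer_region("GCT CCT GCA GCT CCA GCT CCA"))
```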

The primer of the present invention typically has a
nucleic acid sequence selected from the group consisting of SEQ
ID NO: 20, 21, 23 and 24, and can be used as a single primer
or a suitably combined pair of primers. These nucleotide
sequences were designed based on the amino acid sequence of SEQ
ID NO: 1 or 3 as PCR primers for cloning gene fragments
encoding each protein. Each sequence is a mixed primer
comprising all nucleic acids capable of encoding said amino acids.
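
A mixed primer of this kind can be thought of as the set of
all concrete sequences covered by a degenerate notation. The
sketch below, given only as an illustration, expands a
sequence written with the standard IUPAC nucleotide ambiguity
codes into its members. The function name and the sample
input are hypothetical and are not the sequences of SEQ ID
NO: 20, 21, 23 or 24.

```python
from itertools import product
from typing import List

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_degenerate(primer: str) -> List[str]:
    """List every concrete sequence represented by a degenerate primer."""
    bases = primer.upper().replace(" ", "")
    choices = [IUPAC[b] for b in bases]  # KeyError on an unknown symbol
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical input: ATG AAR covers both lysine codons after Met.
print(expand_degenerate("ATG AAR"))
```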



(b) Probes

When analytical nucleic acids of the present
invention are used as probes, the analytical nucleic acids
of the present invention preferably have a sequence
homologous to that of a total or partial region of the
nucleotide sequence of SEQ ID NO: 2, 4, 27 or 29, and
further, may have a mismatch of one or two bases. The
probes of the present invention have a length of 15 bases
or more, preferably 20 bases or more, and up to the full
length of the encoding region, that is, 3120 bases
(corresponding to SEQ ID NO: 2), 2997 bases (corresponding
to SEQ ID NO: 4), 3105 bases (corresponding to SEQ ID NO:
27), or 2961 bases (corresponding to SEQ ID NO: 29). The
probes typically have the nucleic acid sequence shown in SEQ
ID NO: 22 or 25. The probes may be obtained from native
nucleic acid treated with restriction enzymes, or may be
synthesized oligonucleotides.

Probes of the present invention include labeled 
probes having a label such as a fluorescent, radioactive or 
biotinylation label to detect or confirm that the probes 
have hybridized to a target sequence. The presence of a 
nucleic acid to be tested in an analyte can be determined by 
immobilizing the nucleic acid to be tested or an 
amplification product thereof, hybridizing it to a labeled 
probe, and after washing, measuring the label bound to the 
solid phase. Alternatively, it can also be determined by 
immobilizing the analytical nucleic acid, hybridizing to the 
nucleic acid to be tested and detecting the nucleic acid to 
be tested coupled to the solid phase with a labeled probe or 
the like. In the latter case, the immobilized analytical
nucleic acid is also referred to as a probe.

Generally, nucleic acid amplification methods such as
PCR can be readily performed because they are per se well
known in the art, and reagent kits and apparatus for them
are also commercially available. When a nucleic acid
amplification method is performed using a pair of analytical
nucleic acids of the present invention described above as
primers and a nucleic acid to be tested as the template, the
presence of the nucleic acid to be tested in a sample can be
determined by detecting an amplification product, because the
nucleic acid to be tested is amplified while no
amplification occurs when the nucleic acid to be tested is
not contained in the sample. The amplification product can
be detected by electrophoresing the reaction solution after
amplification, staining the bands with ethidium bromide,
immobilizing the amplification product after electrophoresis
to a solid phase such as a nylon membrane, hybridizing the
immobilized product with a labeled probe that specifically
hybridizes to the nucleic acid to be tested, and washing the
hybridization product and then detecting said label.
Further, the amount of the nucleic acid to be tested in a
sample can also be determined by the so-called real-time PCR
detection using a quencher fluorescent dye and a reporter
fluorescent dye. This method can also be readily carried
out using a commercially available real-time PCR detection
kit. The nucleic acid to be tested can also be
semi-quantitatively assayed based on the intensity of
electrophoretic bands. The nucleic acid to be tested may be
mRNA or cDNA reverse-transcribed from mRNA. When mRNA is
to be amplified as the nucleic acid to be tested, the NASBA
methods (3SR, TMA) can also be adopted using said pair of
primers. The NASBA methods can be readily performed because
they are per se well known and kits for them are
commercially available.

(c) Microarrays

Analytical nucleic acids of the present invention can
be used as microarrays. Microarrays are means for enabling
rapid large-scale data analysis of genomic functions.
Specifically, a labeled nucleic acid is hybridized to a
number of different nucleic acid probes immobilized in high
density on a solid substrate such as a glass substrate, a
signal from each probe is detected and the collected data
are analyzed. As used herein, the "microarray" means an
array of an analytical nucleic acid of the present invention
on a solid substrate such as a membrane, filter, chip or
glass surface.

(6) Antibodies

An antibody that is immunoreactive with the protein
of the present invention is provided herein. Such an
antibody specifically binds to the polypeptide via the
antigen-binding site of the antibody (as opposed to
non-specific binding). Therefore, as set forth above, proteins
of SEQ ID NOs: 1 and 3, fragments, variants, and fusion
proteins and the like can be used as "immunogens" in
producing antibodies immunoreactive therewith. More
specifically, the proteins, fragments, variants, and fusion
proteins and the like include the antigenic determinants or
epitopes to induce the formation of an antibody. Such
antigenic determinants or epitopes may be either linear or
conformational (discontinuous). In addition, said antigenic
determinants or epitopes may be identified by any methods
known in the art.

Therefore, one aspect of the present invention
relates to the antigenic epitopes of the protein of the
present invention. Such epitopes are useful for raising
antibodies, in particular monoclonal antibodies, as
described in more detail below. Additionally, epitopes
from the protein of the present invention can be used as
research reagents, in assays, to purify specific binding
antibodies from substances such as polyclonal sera or
supernatants from cultured hybridomas. Such epitopes or
variants thereof can be produced using techniques known in
the art such as solid-phase synthesis, chemical or enzymatic
cleavage of a protein, or by using recombinant DNA
technology.

As for antibodies which can be induced by the
proteins of the present invention, both polyclonal and
monoclonal antibodies can be prepared by conventional
techniques, whether the whole or a part of said proteins
has been isolated, or the epitopes have been isolated. See,
for example, Monoclonal Antibodies, Hybridomas: A New
Dimension in Biological Analyses, Plenum Press, NY, 1980.

Hybridoma cell lines that produce monoclonal 
antibodies specific for the proteins of the present 
invention are also contemplated herein. Such hybridomas can 
be produced and identified by conventional techniques. One 
method for producing such a hybridoma cell line comprises 
immunizing an animal with a protein of the present 
invention; harvesting spleen cells from the immunized 
animal; fusing said spleen cells to a myeloma cell line, 
thereby generating hybridoma cells; and identifying a 
hybridoma cell line that produces a monoclonal antibody that 
binds said protein. The monoclonal antibodies can be
recovered by conventional techniques.

The antibodies of the present invention include
chimeric antibodies such as humanized versions of murine
monoclonal antibodies. Such humanized antibodies can be
prepared by known techniques and offer the advantage of
reduced immunogenicity when the antibodies are administered
to humans. In one embodiment, a humanized monoclonal
antibody comprises the variable region of a murine antibody
(or just the antigen-binding site thereof) and a constant
region derived from a human antibody. Alternatively, a
humanized antibody fragment can comprise the antigen-binding
site of a murine monoclonal antibody and a variable region
fragment (lacking the antigen-binding site) derived from a
human antibody.

The present invention includes antigen-binding
antibody fragments that can also be generated by
conventional techniques. Such fragments include, but are
not limited to, Fab and F(ab')2. Antibody
fragments generated by genetic engineering techniques and
derivatives thereof are also provided.

In one embodiment, the antibody is specific to the
protein of the present invention, and it does not
cross-react with other proteins. Screening procedures by which
such antibodies can be identified are publicly known, and
may involve, for example, immunoaffinity chromatography.

The antibodies of the invention can be used in assays
to detect the presence of the protein or fragments of the
present invention, either in vitro or in vivo. The
antibodies also can be used in purifying proteins or
fragments of the present invention by immunoaffinity
chromatography.

Further, a binding partner such as an antibody that
can block binding of a protein of the present invention to
an acceptor substrate can be used to inhibit a biological
activity arising from such binding. Such a blocking
antibody may be identified by any suitable assay procedure,
such as by testing the antibody for the ability to inhibit
binding of said protein to specific cells expressing the
acceptor substrate. Alternatively, a blocking antibody can
be identified in assays for the ability to inhibit a
biological effect that results from a protein of the present
invention binding to the binding partner of target cells.

Such an antibody can be used in an in vitro procedure,
or administered in vivo to inhibit a biological activity
mediated by the entity that generated the antibody.
Disorders caused or exacerbated (directly or indirectly) by
the interaction of a protein of the present invention with a
binding partner thus can be treated. A therapeutic method
involves in vivo administration of a blocking antibody to a
mammal in an amount effective to inhibit a binding
partner-mediated biological activity. Monoclonal antibodies are
generally preferred for use in such therapeutic methods. In
one embodiment, an antigen-binding antibody fragment is used.

(7) Cancer markers and methods for detection

The protein or nucleic acids of the present invention
can be used as a cancer marker, and be applied to diagnosis
and treatment of cancers and the like. As used herein, the
term "cancer" typically means all malignant tumors, and
includes disease conditions with said malignant tumors.
"Cancer" includes, but is not limited to, lung cancer, liver
cancer, kidney cancer and leukemia.

"Cancer marker" as used herein means the protein and
nucleic acids of the present invention that are expressed at
a higher level in a cancerous biological sample than in a
non-cancerous biological sample. In addition, "biological
sample" includes tissues, organs, and cells. Blood is
preferable, and pathological tissue is more preferable.

Specifically, when the protein of the present
invention is used as a cancer marker, a method for detection
of the present invention includes the steps of: (a) quantifying
said protein in a biological sample; and (b) estimating that
the biological sample is cancerous in the case that the
quantity value of said protein in the biological sample is
more than that in a control biological sample. In said
method for detection, the antibody of the present invention
can be used to quantify said protein in the biological
sample. According to the present invention, generally, the
method for quantifying the protein is not limited to the
above methods and can use quantification methods known in the art
such as ELISA and Western blotting. A ratio of the quantity
values is preferably 1.5 times or more, more preferably 3
times or more, and even more preferably 10 times or more.

On the other hand, when the nucleic acid of the
present invention is used as a cancer marker, a method for
detection of the present invention includes the steps of:
(a) quantifying said nucleic acid in a biological sample;
and (b) estimating that the biological sample is cancerous
in the case that the quantity value of said nucleic acid in
the biological sample is 1.5 times or more that of a
control biological sample. Preferably, the steps comprise
(a) hybridizing at least one of said analytical nucleic
acids to said nucleic acid in the biological sample; (b)
amplifying said nucleic acid; (c) hybridizing said nucleic
acids to the amplification product; (d) quantifying a signal
arising from said amplification product and said analytical
nucleic acid hybridized; and (e) estimating that the
biological sample is cancerous in the case that the quantity
value of said signal is 1.5 times or more that of a
corresponding signal of a control biological sample.

More specifically, as described in the example below,
canceration can be estimated by determination of a ratio of
expression level of the nucleic acids in cancerous tissue
and normal tissue by quantitative PCR. According to the
present invention, the quantification of the nucleic acid is
not limited to this, and for example, RT-PCR, northern
blotting, dot blotting or DNA microarray may be used. In
such quantification, nucleic acids of genes present
generally and broadly in the same tissue and the like, such as
nucleic acids encoding glyceraldehyde-3-phosphate
dehydrogenase (GAPDH) or β-actin, are used as a control. A
quantity ratio to be estimated as canceration is preferably
1.5 or more, more preferably 3 or more, even more preferably
10 or more.
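
The ratio described above can be illustrated, as a sketch only,
by normalizing the measured level of the target nucleic acid
to a control gene such as GAPDH in each sample and then
comparing cancerous and normal tissue against the 1.5, 3 and
10 thresholds. The function names, the simple division used
for normalization and the sample values are assumptions of
this sketch and do not reproduce the procedure of the examples.

```python
def normalized(target: float, control: float) -> float:
    """Target signal divided by the control-gene signal of the same sample."""
    if control <= 0:
        raise ValueError("control signal must be positive")
    return target / control


def expression_ratio(tumor_target: float, tumor_control: float,
                     normal_target: float, normal_control: float) -> float:
    """Ratio of normalized expression in cancerous versus normal tissue."""
    normal = normalized(normal_target, normal_control)
    if normal <= 0:
        raise ValueError("normal-tissue expression must be positive")
    return normalized(tumor_target, tumor_control) / normal


def classify(ratio: float) -> str:
    """Map a ratio onto the preference tiers named in the text."""
    if ratio >= 10:
        return "10 or more"
    if ratio >= 3:
        return "3 or more"
    if ratio >= 1.5:
        return "1.5 or more"
    return "below 1.5"


# Hypothetical signals: tumor 600 (control 200), normal 100 (control 100).
print(classify(expression_ratio(600, 200, 100, 100)))
```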

The following examples further illustrate the present
invention without, however, limiting the invention thereto.

Examples

Example 1 Preparation of the human protein of the present
invention

1. Search through a genetic database and determination of
the nucleic acid sequence of a novel N-acetylgalactosamine
transferase

A search for similar genes through a genetic database
was performed by use of the genes for existing
β-1,4-galactose transferases. The sequences used were SEQ ID NOs:
AL161445, AF038660, AF038661, AF022367, AF038663, AF038664
in the genes for β-1,4-galactose transferases. The search
was performed using a program such as Blast [Altschul et al.,
J. Mol. Biol., 215, 403-410 (1990)].

As a result, GenBank Accession No. N48738 was found
as an EST sequence, and GenBank Accession No. AC006205 was
found as a genome sequence. Further, it was
considered that the two sequences belong to distinct genes
(hereinafter, the genes comprising N48738 and AC006205 are
referred to as NGalNAc-T1 and NGalNAc-T2, respectively). Since the
translation initiation sites of both genes were unknown, it
was impossible to predict the full length of the genes.
Marathon-Ready cDNA (Human Brain or Stomach) from CLONTECH
was used for obtaining the information on coding regions (5'
RACE: Rapid Amplification of cDNA Ends) and for cloning.

Obtaining information on the coding region of NGalNAc-T1

AP1 primer included in Marathon cDNA (a DNA fragment
having adaptors AP1 and AP2 at both ends) and primer K12R6
generated within the identified sequence part (5'-GCT CCT
GCA GCT CCA GCT CCA-3') (SEQ ID NO: 5) were used for PCR (30
cycles of 94 °C for 20 seconds, 60 °C for 30 seconds and 72
°C for 2 minutes). Further, AP2 primer included in Marathon
cDNA and primer K12R5 generated within the identified
sequence part (5'-AAG CGA CTC CCT CGC GCC GAG T-3') (SEQ ID
NO: 6) were used for nested PCR (30 cycles of 94 °C for 20
seconds, 60 °C for 30 seconds and 72 °C for 2 minutes). A
fragment of about 0.6 kb obtained as a result was purified
by a common method, and the nucleic acid sequence was
analyzed. However, since a transmembrane sequence
characteristic of glycosyl transferases (20 hydrophobic amino
acids) did not appear, an EST sequence (GenBank Accession No.
PF058197) was discovered based on the obtained sequence and
the nucleic acid sequence of NGalNAc-T2 described later by
a search through a genome database. Based on this
nucleic acid sequence information, RT-PCR was performed using two
primers (K12F101: 5'-ATG CCG CGG CTC CCG GTG AAG AAG-3' (SEQ
ID NO: 7) and K12R5) and the amplification was confirmed.
Therefore, it was shown that this EST sequence and the
sequence obtained by 5' RACE exist on one mRNA. The
full-length nucleotide sequence (3120 bp) is shown in SEQ ID
NO: 2.

Obtaining information on the coding region of NGalNAc-T2

AP1 primer included in Marathon cDNA (a DNA fragment
having adaptors AP1 and AP2 at both ends) and primer K13-R3
generated within the identified sequence part (5'-CAA CAG
TTC AAG CTC CAG GAG GTA-3' (SEQ ID NO: 8)) were used for PCR
(30 cycles of 94 °C for 20 seconds, 60 °C for 30 seconds and
72 °C for 2 minutes). Further, AP2 primer included in
Marathon cDNA and primer K13R2 generated within the
identified sequence part (5'-CTG ACG CTT TTC CAC GTT CAC
AAT-3' (SEQ ID NO: 9)) were used for nested PCR (30 cycles of
94 °C for 20 seconds, 60 °C for 30 seconds and 72 °C for 2
minutes). A fragment of about 1.0 kb obtained as a result
was purified by a common method, and the nucleic acid
sequence was analyzed. Further, a coding region of a
protein was determined. However, since a transmembrane
sequence characteristic of glycosyl transferases (20
hydrophobic amino acids) did not appear, 5' RACE was
performed three more times. The primers used here are shown
in Table 2.

As a result, the obtained full-length nucleotide
sequence (2997 bp) is shown in SEQ ID NO: 4.

Table 2 Various primers used in RACE

Second 5' RACE primers
K13 R6   5'-CAC CCC GTC TCT GCT CTG CGA T-3'     (SEQ ID NO: 10)
K13 R5   5'-GTC TTC CTG GGG CTG TCA CCA-3'       (SEQ ID NO: 11)
Third 5' RACE primers
K13 R7   5'-CAC CTC ATC CAT CTG TAG GAA CGT-3'   (SEQ ID NO: 12)
K13 R8   5'-CTG TCG CCA TGC AAC TTC CAC GT-3'    (SEQ ID NO: 13)
Fourth 5' RACE primers
K13 R12  5'-AAT GTC GTG GTC CTC GAG GCT CA-3'    (SEQ ID NO: 14)
K13 R11  5'-GAT GGT AGA ACT GGA GGT GTG GAT-3'   (SEQ ID NO: 15)

WO 2004/016790
PCT/JP2003/010309

2. Integration of GalNAc-T gene into an expression vector 

To prepare an expression system for GalNAc-T, a portion of
the GalNAc-T gene was first integrated into pFLAG-CMV1
(Sigma).

Integration of NGalNAc-T1 into pFLAG-CMV1

A region corresponding to amino acids 62-1039 of SEQ
ID NO: 1 or 2 was amplified with LA Taq DNA polymerase
(Takara Shuzo) using Marathon cDNA (Human Brain) as a
template, forward primer K12-Hin-F2: 5'-CCC AAG CTT CGG GGG
GTC CAC GCT GCG CCA T-3' (SEQ ID NO: 16), and reverse primer
K12-Xba-R1: 5'-GCT CTA GAC TCA AGA CGC CCC CGT GCG AGA-3'
(SEQ ID NO: 17). The fragment was digested at the restriction
sites (HindIII and XbaI) included in the primers and inserted
into pFLAG-CMV1 digested with HindIII and XbaI, using
Ligation High (Toyobo), to prepare pFLAG-NGalNAc-T1.

Integration of NGalNAc-T2 into pFLAG-CMV1

A region corresponding to amino acids 57-998 of SEQ
ID NO: 3 or 4 was amplified with LA Taq DNA polymerase
(Takara Shuzo) using Marathon cDNA (Human Stomach) as a
template, forward primer K13-Eco-F1: 5'-GGA ATT CGA GGT ACG
GCA GCT GGA GAG AA-3' (SEQ ID NO: 18), and reverse primer
K13-Sal-R1: 5'-ACG CGT CGA CCT ACA GCG TCT TCA TCT GGC GA-3'
(SEQ ID NO: 19). This fragment was digested at the
restriction sites (EcoRI and SalI) included in the primers
and inserted temporarily into pcDNA3.1 digested with EcoRI
and SalI. The resulting construct was digested with EcoRI and
PmeI. The fragment including the active site of NGalNAc-T2
was inserted at the EcoRI-EcoRV site of pFLAG-CMV1 using
Ligation High (Toyobo Co.) to prepare pFLAG-NGalNAc-T2.
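
The cloning steps above rely on each primer carrying the
recognition site of the enzyme used for digestion. The
following is a minimal sketch, not part of the original
disclosure, that checks this for the four primers of SEQ ID
NOs: 16 to 19. The recognition sequences are the standard ones
for HindIII, XbaI, EcoRI and SalI; the function names are
illustrative assumptions.

```python
# Standard recognition sequences of the enzymes named in the text.
SITES = {"HindIII": "AAGCTT", "XbaI": "TCTAGA", "EcoRI": "GAATTC", "SalI": "GTCGAC"}

# Primers as printed (SEQ ID NOs: 16-19) and the enzyme each is said to carry.
PRIMERS = {
    "K12-Hin-F2": ("CCC AAG CTT CGG GGG GTC CAC GCT GCG CCA T", "HindIII"),
    "K12-Xba-R1": ("GCT CTA GAC TCA AGA CGC CCC CGT GCG AGA", "XbaI"),
    "K13-Eco-F1": ("GGA ATT CGA GGT ACG GCA GCT GGA GAG AA", "EcoRI"),
    "K13-Sal-R1": ("ACG CGT CGA CCT ACA GCG TCT TCA TCT GGC GA", "SalI"),
}


def site_offset(primer: str, enzyme: str) -> int:
    """Return the 0-based offset of the enzyme site in the primer, or -1."""
    return primer.replace(" ", "").upper().find(SITES[enzyme])


def check_primers() -> dict:
    """Map each primer name to the offset of its expected restriction site."""
    return {name: site_offset(seq, enz) for name, (seq, enz) in PRIMERS.items()}


if __name__ == "__main__":
    for name, offset in check_primers().items():
        print(name, "site found" if offset >= 0 else "site missing", offset)
```

Each site sits within the first few bases of its primer, behind
a short 5' overhang, which is the usual arrangement for primers
intended for restriction cloning.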

3. Transfection and expression of recombinant enzymes 

15 µg of pFLAG-NGalNAc-T1 or pFLAG-NGalNAc-T2 was
introduced into 2 × 10^6 COS-1 cells, which had been cultured
overnight in DMEM (Dulbecco's modified Eagle's medium)
containing 10 % FCS (fetal calf serum), using Lipofectamine
2000 (Invitrogen Co.) according to the protocol provided by
the same company. The culture supernatant was collected after
48-72 hours. The supernatant was mixed with NaN3 (0.05 %),
NaCl (150 mM), CaCl2 (2 mM) and an anti-M1 resin (Sigma Co.)
(50 µl), and the mixture was stirred overnight at 4 °C. The
reaction mixture was centrifuged (3000 rpm, 5 min, 4 °C) to
collect a pellet. The pellet was combined with 900 µl of 2
mM CaCl2/TBS and re-centrifuged (2000 rpm, 5 min, 4 °C),
after which the pellet was suspended in 200 µl of 1 mM
CaCl2/TBS to give a sample for assaying activity (NGalNAc-T1
or NGalNAc-T2 enzyme solution).

The enzyme was subjected to conventional SDS-PAGE and
Western blotting, and the expression of the intended protein
was confirmed. Anti-FLAG M2-peroxidase (A-8592, SIGMA Co.)
was used as the antibody.

Example 2 Assay of activity using the enzyme of the present
invention

1. Search for donor substrates

A search for a donor substrate of the enzyme of the
present invention was performed on various monosaccharide
acceptor substrates, using 5 µl of enzyme solution and
various acceptor substrates.

The acceptor substrates were prepared so that each of
Gal-α-pNp, Gal-β-oNp, GalNAc-α-Bz, GalNAc-β-pNp, GlcNAc-α-pNp,
GlcNAc-β-pNp, Glc-α-pNp, Glc-β-pNp, GlcA-β-pNp, Fuc-α-pNp,
Man-α-pNp (the foregoing from CALBIOCHEM Co.), Xyl-α-pNp and
Xyl-β-pNp (the foregoing from SIGMA Co.) was included at 2.5
nmol/20 µl. Further, the solutions of the various donor
substrates (UDP-GalNAc, UDP-GlcNAc, UDP-Gal, GDP-Man,
UDP-GlcA, UDP-Xyl and GDP-Fuc; all from SIGMA Co.) are
shown in Table 3.

Table 3

GalNAc-T
  MES or HEPES (pH 5.5 - )       50 mM
  UDP-GalNAc                     0.5 mM
  UDP-[14C]GalNAc                2 nCi/µl
  MnCl2                          20 mM
  Triton X-100                   0.5 %

GlcNAc-T
  HEPES (pH 7.0 or 7.5)          14 mM
  UDP-GlcNAc                     0.5 mM
  UDP-[14C]GlcNAc                2 nCi/µl
  MnCl2                          10 mM
  Triton CF-54                   [illegible]
  ATP                            0.75 mM

Gal-T
  HEPES (pH 7.0 or 7.5)          14 mM
  UDP-Gal                        0.25 mM
  UDP-[14C]Gal                   2.5 nCi/µl
  MnCl2                          10 mM
  ATP                            0.75 mM

GlcA-T
  MES (pH 7.0)                   50 mM
  UDP-GlcA                       0.25 mM
  UDP-[14C]GlcA                  2 nCi/µl
  MnCl2                          10 mM

Xyl-T
  MES (pH 7.0)                   50 mM
  UDP-Xyl                        0.25 mM
  UDP-[14C]Xyl                   1 nCi/µl
  MnCl2                          10 mM

Fuc-T
  cacodylate buffer (pH 7.0)     50 mM
  GDP-[14C]Fuc                   1 nCi/µl
  MnCl2                          10 mM
  ATP                            5 mM

Man-T
  Tris (pH 7.2)                  60 mM
  GDP-[14C]Man                   2 nCi/µl
  MnCl2                          10 mM
  Triton X-100                   0.6 %

All reaction times were 16 hours. After the reaction,
unreacted radioactive substrates were removed with a Sep-Pak
C18 column (Waters Co.), and the radioactivity from donor
substrates incorporated into acceptor substrates was
determined with a liquid scintillation counter. A slight
background signal appeared even with UDP-GlcA for both
NGalNAc-T1 and NGalNAc-T2; however, the highest activity was
detected with UDP-GalNAc as the donor substrate.



2. Search for acceptor substrates 

Further, in order to investigate acceptors, reactions
were performed using each acceptor (10 nmol/20 µl) by itself.
As a result, significant radioactivity was detected in the
case of GlcNAc-β-pNp (NGalNAc-T1: 256.26 dpm, NGalNAc-T2:
1221.22 dpm). These results showed that both NGalNAc-T1 and
NGalNAc-T2 are glycosyltransferases capable of transferring
GalNAc to GlcNAc.



3. Study of optimum pH 

As described above, NGalNAc-T1 and NGalNAc-T2 were shown
to be glycosyltransferases that transfer GalNAc to GlcNAc.
The optimum pH of both enzymes was therefore studied. The
buffer solutions used were MES (pH 5.5, 6.0, 6.26, 6.5, 6.75)
and HEPES (pH 6.75, 7.0, 7.4). As shown in Table 4, among the
MES buffers the activity tended to be highest at pH 6.5 for
both NGalNAc-T1 and NGalNAc-T2.






Table 4 Optimum pH for the enzymatic activity of NGalNAc-T1
and NGalNAc-T2

NGalNAc-T1
                          Incorporation of
pH                        radioactivity (A)   Blank (B)   (A) - (B)
MES buffer (pH 5.5)             339.76         263.21        76.55
MES buffer (pH 6.0)             321.04         263.21        57.83
MES buffer (pH 6.26)            636.34         263.21       373.13
MES buffer (pH 6.5)            1767.72         263.21      1504.51
MES buffer (pH 6.75)            923.92         263.21       660.71
HEPES buffer (pH 6.75)         1685.06         263.21      1421.85
HEPES buffer (pH 7.0)          1138.38         263.21       875.17
HEPES buffer (pH 7.4)          2587.48         263.21      2324.27
                                                            (dpm)

NGalNAc-T2
                          Incorporation of
pH                        radioactivity (A)   Blank (B)   (A) - (B)
MES buffer (pH 5.5)             336.20         263.21        72.99
MES buffer (pH 6.0)             341.92         263.21        78.71
MES buffer (pH 6.26)            339.50         263.21        76.29
MES buffer (pH 6.5)             753.62         263.21       490.05
MES buffer (pH 6.75)            529.24         263.21       266.03
HEPES buffer (pH 6.75)          915.16         263.21       651.95
HEPES buffer (pH 7.0)           786.70         263.21       523.49
HEPES buffer (pH 7.4)           586.32         263.21       323.11
                                                            (dpm)
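
The blank correction used in Table 4 is a simple subtraction,
(A) - (B). The sketch below, which is illustrative and not
part of the original disclosure, recomputes the corrected
values from the raw counts in the table and picks the condition
with the highest signal, optionally restricted to one buffer.
The function and variable names are assumptions.

```python
BLANK_DPM = 263.21  # blank (B) column of Table 4, in dpm

# Raw incorporation (A) in dpm, keyed by (buffer, pH), transcribed from Table 4.
RAW_DPM = {
    "NGalNAc-T1": {
        ("MES", 5.5): 339.76, ("MES", 6.0): 321.04, ("MES", 6.26): 636.34,
        ("MES", 6.5): 1767.72, ("MES", 6.75): 923.92,
        ("HEPES", 6.75): 1685.06, ("HEPES", 7.0): 1138.38, ("HEPES", 7.4): 2587.48,
    },
    "NGalNAc-T2": {
        ("MES", 5.5): 336.20, ("MES", 6.0): 341.92, ("MES", 6.26): 339.50,
        ("MES", 6.5): 753.62, ("MES", 6.75): 529.24,
        ("HEPES", 6.75): 915.16, ("HEPES", 7.0): 786.70, ("HEPES", 7.4): 586.32,
    },
}


def net_dpm(raw, blank=BLANK_DPM):
    """Blank-corrected incorporation, (A) - (B), rounded to two decimals."""
    return round(raw - blank, 2)


def best_condition(enzyme, buffer=None):
    """Return ((buffer, pH), net dpm) for the highest blank-corrected signal."""
    rows = {
        key: net_dpm(value)
        for key, value in RAW_DPM[enzyme].items()
        if buffer is None or key[0] == buffer
    }
    best = max(rows, key=rows.get)
    return best, rows[best]


if __name__ == "__main__":
    for enzyme in RAW_DPM:
        print(enzyme, best_condition(enzyme, "MES"), best_condition(enzyme))
```

Recomputing in this way reproduces every printed (A) - (B)
value except the NGalNAc-T2 entry for MES (pH 6.5), where
753.62 - 263.21 gives 490.41 rather than the printed 490.05;
one of the two printed figures for that row presumably
contains a transcription error.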



The value obtained with MES (pH 6.75) in the absence of
enzyme (263.21 dpm) was adopted as the blank value. The
highest values were obtained in HEPES buffer, at pH 7.4 for
NGalNAc-T1 and at pH 6.75 for NGalNAc-T2. However, the
activity did not always increase as the pH increased.
Hereinafter, MES (pH 6.5) was used in each experiment.

4. Study of divalent cation requirements

Glycosyltransferases frequently require divalent cations.
The activity of each enzyme was therefore studied in the
presence of various divalent cations. High values were
obtained when Mn2+ was added for NGalNAc-T1, and when Mg2+,
Mn2+ or Co2+ was added for NGalNAc-T2 (see Table 5). In
contrast, neither enzyme showed activity when EDTA, a
chelating agent, was added. These results showed that both
enzymes require divalent cations.

Table 5 Divalent cation requirements for the activity of
NGalNAc-T1 and NGalNAc-T2

NGalNAc-T1
                          Incorporation of
Divalent cations etc.     radioactivity (A)   Blank (B)   (A) - (B)
MnCl2                           519.47         263.21       256.26
MgCl2                           256.36         263.21        -6.85
ZnCl2                           210.29         263.21       -52.92
CaCl2                           230.78         263.21       -32.43
CuCl2                           278.77         263.21        15.56
CoCl2                           240.91         263.21       -22.30
CdSO4                           203.39         263.21       -59.82
EDTA                            242.38         263.21       -20.83
                                                            (dpm)

NGalNAc-T2
                          Incorporation of
Divalent cations etc.     radioactivity (A)   Blank (B)   (A) - (B)
MnCl2                          1484.43         263.21      1221.22
MgCl2                          3124.16         263.21      2860.95
ZnCl2                           187.59         263.21       -75.62
CaCl2                           217.83         263.21       -45.38
CuCl2                           218.35         263.21       -44.86
CoCl2                          1130.63         263.21       867.42
CdSO4                           217.92         263.21       -45.29
EDTA                            235.28         263.21       -27.93
                                                            (dpm)
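
One way to read Table 5 is to ask which additives give a
blank-corrected signal clearly above the scatter of the
negative values. The sketch below is illustrative only: the
noise criterion (the size of the most negative corrected value
for each enzyme) is an assumption made here, not a criterion
stated in the original, and the function names are likewise
hypothetical.

```python
# Blank-corrected counts (A) - (B) in dpm, transcribed from Table 5.
NET_DPM = {
    "NGalNAc-T1": {
        "MnCl2": 256.26, "MgCl2": -6.85, "ZnCl2": -52.92, "CaCl2": -32.43,
        "CuCl2": 15.56, "CoCl2": -22.30, "CdSO4": -59.82, "EDTA": -20.83,
    },
    "NGalNAc-T2": {
        "MnCl2": 1221.22, "MgCl2": 2860.95, "ZnCl2": -75.62, "CaCl2": -45.38,
        "CuCl2": -44.86, "CoCl2": 867.42, "CdSO4": -45.29, "EDTA": -27.93,
    },
}


def noise_floor(net):
    """Assumed noise estimate: magnitude of the most negative corrected value."""
    return max((-value for value in net.values() if value < 0), default=0.0)


def supporting_additives(enzyme):
    """Additives whose net signal exceeds the assumed noise floor, strongest first."""
    net = NET_DPM[enzyme]
    floor = noise_floor(net)
    hits = [name for name, value in net.items() if value > floor]
    return sorted(hits, key=net.get, reverse=True)


if __name__ == "__main__":
    for enzyme in NET_DPM:
        print(enzyme, supporting_additives(enzyme))
```

Under this assumed criterion the selection agrees with the
text: only Mn2+ for NGalNAc-T1, and Mg2+, Mn2+ and Co2+ for
NGalNAc-T2, with EDTA giving no signal for either enzyme.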



Example 3 Expression analysis in various human tissues 

The expression level of said genes was quantified by
quantitative PCR using cDNA from normal human tissues. cDNA
reverse-transcribed from total RNA of normal tissues
(CLONTECH Co.) was used. For cell lines, total RNA was
extracted and cDNA was prepared by conventional methods. The
quantitative expression analysis of NGalNAc-T1 was performed
using primers K12-F3 (5'-ctg gtg gat ttc gag agc ga-3' (SEQ
ID NO: 20)) and K12-R3 (5'-tgc cgt cca gga tgt tgg-3' (SEQ ID
NO: 21)), and probe K12-MGB3 (5'-gcg gta gag gac gcc-3' (SEQ
ID NO: 22)). The quantitative expression analysis of
NGalNAc-T2 was performed using primers K13-F3 (5'-atc gtc
atc act gac tat agc agt ga-3' (SEQ ID NO: 23)) and K13-R3
(5'-gaa tgg cat cga tga ctc cag-3' (SEQ ID NO: 24)), and
probe K13-MGB3 (5'-ctc gtg aag gac ccg ca-3' (SEQ ID NO:
25)). Probes with a minor groove binder (Applied Biosystems
Co.) were used. Universal PCR Master Mix was used as the
enzyme and reaction solution, and 25 µl of the reaction
solution was quantified with the ABI PRISM 7700 Sequence
Detection System (both from Applied Biosystems Co.).
Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was used as
a standard gene for quantification. A calibration curve for
quantification was made using template DNA at known
concentrations, and the expression level of said gene was
normalized. pFLAG-NGalNAc-T1 and pFLAG-NGalNAc-T2 were used
as the standard DNAs for NGalNAc-T1 and NGalNAc-T2. The
reaction conditions were 50 °C for 2 min and 95 °C for 10
min, followed by 50 cycles of 95 °C for 15 sec and 60 °C for
1 min. The results are shown in Figure 1. The expression
levels of NGalNAc-T1 and NGalNAc-T2 were found to be high in
the nervous system, and in the stomach and testis,
respectively.
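
The calibration-curve step described above can be sketched as
follows. This is a minimal illustration of standard-curve
quantification in general, not the instrument's own software:
a straight line is fitted to threshold cycle (Ct) against
log10 of the template quantity, inverted for unknown samples,
and the target quantity is divided by that of the reference
gene. All numerical values and function names are
hypothetical.

```python
import math


def fit_standard_curve(quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the starting template quantity."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(target_ct, target_curve, reference_ct, reference_curve):
    """Target quantity divided by the reference-gene (e.g. GAPDH) quantity."""
    target = quantity_from_ct(target_ct, *target_curve)
    reference = quantity_from_ct(reference_ct, *reference_curve)
    return target / reference


# Hypothetical dilution series of a plasmid standard (copies per reaction)
# and the Ct values measured for it; illustrative numbers only.
STANDARD_COPIES = [1e3, 1e4, 1e5, 1e6]
STANDARD_CTS = [30.0, 26.7, 23.4, 20.1]

if __name__ == "__main__":
    curve = fit_standard_curve(STANDARD_COPIES, STANDARD_CTS)
    print("slope, intercept:", curve)
    print("copies at Ct 25:", quantity_from_ct(25.0, *curve))
```

In practice a separate curve would be fitted for each target
(here the pFLAG-NGalNAc-T1 and pFLAG-NGalNAc-T2 standards) and
for the reference gene.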

Example 4 Expression analysis of human cancerous tissue 

The expression levels of both genes in human lung
cancerous tissue and normal lung tissue from the same patient
were analyzed. The methods were the same as those of Example
3, except that the β-actin gene was used as a control gene
and Pre-Developed TaqMan Assay Reagents Endogenous Control
Human Beta-actin (Applied Biosystems Co.) was used in the
quantification (Figure 2). These results indicated that both
genes can be used at least as lung cancer markers.
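
The paired comparison described in this example can be
sketched as a ratio of normalized expression levels. The
sketch below is illustrative only; the 1.5-fold threshold is
the one recited in Claims 22 and 23, while the sample
quantities and function names are hypothetical.

```python
def relative_expression(target_quantity, control_quantity):
    """Target-gene quantity normalized to the control gene (here beta-actin)."""
    if control_quantity <= 0:
        raise ValueError("control gene quantity must be positive")
    return target_quantity / control_quantity


def fold_change(tumor, normal):
    """Ratio of normalized expression in tumor tissue to paired normal tissue.

    Each argument is a (target_quantity, control_quantity) pair.
    """
    return relative_expression(*tumor) / relative_expression(*normal)


def exceeds_threshold(tumor, normal, threshold=1.5):
    """True when the tumor/normal ratio is at least the threshold."""
    return fold_change(tumor, normal) >= threshold


# Hypothetical quantities in arbitrary units, for illustration only.
tumor_sample = (600.0, 2000.0)
normal_sample = (150.0, 1500.0)

if __name__ == "__main__":
    print(fold_change(tumor_sample, normal_sample))
    print(exceeds_threshold(tumor_sample, normal_sample))
```

Normalizing each tissue to its own control gene before taking
the ratio keeps differences in RNA input between the two
samples from being mistaken for a change in expression.
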

Example 5 Assay for acceptor substrates of
glycosyltransferase activities

For the GalNAc-T assay, 50 mM MES buffer (pH 6.5)
containing 0.1 % Triton X-100, 1 mM UDP-GalNAc, 10 mM MnCl2
and 500 µM of each acceptor substrate was used. 10 µl of
enzyme solution was added to each 20 µl reaction mixture,
which was incubated at 37 °C for various periods. After the
incubation, the mixture was filtered through an Ultrafree-MC
column (Millipore, Bedford, MA), and a 10 µl aliquot was
subjected to reversed-phase high performance liquid
chromatography (HPLC) on an ODS-80TS QA column (4.6 x 250 mm;
Tosoh, Tokyo, Japan). 0.1 % TFA/H2O with 12 % acetonitrile
was used as the running solution. An ultraviolet
spectrophotometer (absorbance at 210 nm), SPD-10Avp
(Shimadzu, Kyoto, Japan), was used for detection of the
peaks. When pyridylamino-labeled oligosaccharides were used
as acceptor substrates, 50 nM substrates were added to the
reaction mixtures. For analysis of the products derived from
pyridylamino-labeled oligosaccharides, 100 mM acetic
acid/triethylamine (pH 4.0) was used as the running solution
and the products were eluted with a 30-70 % gradient of 1 %
1-butanol in running solution at a flow rate of 1.0 ml/min at
55 °C.

200 µg of the reaction product was dissolved in 150 µl
of D2O in a micro cell and used as a sample for 1H NMR
experiments. One-dimensional and two-dimensional 1H NMR
spectra were recorded with DMX750 (Bruker, Germany; 750.13
MHz for the 1H nucleus) and ECA800 (JEOL, Tokyo, Japan;
800.14 MHz for the 1H nucleus) spectrometers at 25 °C. The
methylene proton of the benzyl group at higher field (4.576
ppm) was tentatively used as the reference for the 1H NMR
chemical shifts.

To investigate the specificity for acceptor substrates,
N- and O-glycans containing GlcNAc at their non-reducing
termini were used. As shown in Tables 6 and 7, all acceptor
substrates examined could receive a GalNAc residue.

Table 6

Substrate specificity of NGalNAc-Ts

                                                Relative activity (%)
Acceptor substrate                              NGalNAc-T1  NGalNAc-T2
1. GlcNAcβ-Bz                                      100         100
2. GlcNAcβ1-6(Galβ1-3)GalNAcα-pNp (core2-pNp)       15.2        11.4
3. GlcNAcβ1-3GalNAcα-pNp (core3-pNp)                20.0        32.3
4. GlcNAcβ1-6GalNAcα-pNp (core6-pNp)               190.7       220.4

Table 7

Substrate specificity of NGalNAc-Ts

Relative activity (%) is given as NGalNAc-T1 / NGalNAc-T2.

1. GlcNAcβ1-2Manα1-6(GlcNAcβ1-2Manα1-3)Manβ1-4GlcNAcβ1-4GlcNAc-PA
   100 / 100
2. GlcNAcβ1-2Manα1-6(GlcNAcβ1-2Manα1-3)Manβ1-4GlcNAcβ1-4(Fucα1-6)GlcNAc-PA
   76.8 / 87.1
3. Galβ1-4GlcNAcβ1-2Manα1-6(GlcNAcβ1-2Manα1-3)Manβ1-4GlcNAcβ1-4GlcNAc-PA
   26.2 / 45.0
4. Galβ1-4GlcNAcβ1-2Manα1-6(GlcNAcβ1-2Manα1-3)Manβ1-4GlcNAcβ1-4(Fucα1-6)GlcNAc-PA
   26.7 / 51.7
5. GlcNAcβ1-2Manα1-6(Galβ1-4GlcNAcβ1-2Manα1-3)Manβ1-4GlcNAcβ1-4GlcNAc-PA
   16.2 / 21.6
6. GlcNAcβ1-2Manα1-6(Galβ1-4GlcNAcβ1-2Manα1-3)Manβ1-4GlcNAcβ1-4(Fucα1-6)GlcNAc-PA
   3.4 / 5.0


1H NMR spectroscopy was performed to determine the newly
formed glycosidic linkage of the NGalNAc-T2 product. The
one-dimensional 1H NMR spectrum of the NGalNAc-T2 product is
shown in Fig. 5. In the NMR spectra, the signal integrals
(not shown; five phenyl protons of Bz, two methylene protons
of Bz, two anomeric protons, twelve sugar protons other than
the anomeric protons, and six methyl protons of the two
N-acetyl groups) were in good agreement with the structure
GalNAc-GlcNAc-O-Bz. As shown in Fig. 5 and in Table 8, the
two anomeric protons resonated at very close magnetic fields,
each with a coupling constant (J1,2) larger than 8 Hz. This
indicates that the two pyranoses in the sample are in the
β-gluco configuration. All 1H signals could be assigned from
high-resolution COSY, TOCSY and NOESY experiments. The
anomeric resonance at lower field showed an NOE with the two
methylene protons of the benzyl group (not shown), whereas
the anomeric resonance at higher field did not (not shown).
These facts mean that the anomeric resonance at lower field
belongs to the anomeric proton of the substrate pyranose
(β-GlcNAc, defined as A), and that the anomeric resonance at
higher field belongs to the anomeric proton of the
transferred pyranose (β-GalNAc, defined as B). The chemical
shifts and coupling constants of the sugar part of the sample
are shown in Table 8. The chemical shift and signal splitting
of the B-4 resonance were characteristic of the β-Gal
configuration [see Reference 15], and the order of the
chemical shifts of the A1-A6 protons was characteristically
similar to that observed for β-GlcNAc in LNnT
(Galβ1-4GlcNAcβ1-3Galβ1-4Glc). As shown in Fig. 6, a weak NOE
cross peak between B1 and A4 and very weak NOE cross peaks
between B1 and the two A6 protons were observed, in addition
to strong intra-residue NOEs between B1 and B5 and between
A1 and A5. These observations suggest a β1-4 linkage between
the two pyranoses. The NMR results thus clearly indicated
that the NGalNAc-T2 product is GalNAcβ1-4GlcNAc-O-Bz.




Table 8

Chemical shifts (ppm) and coupling constants (Hz) of
sugar CH protons in the NGalNAc-T2 product

                 NGalNAc-T2 product
                 GlcNAc     GalNAc

1H chemical shifts (ppm) a

δ1               4.434      4.425
δ2               3.647      3.831
δ3               3.546      3.665
δ4               3.534      3.846
δ5               3.411      3.628
δ6               3.589      3.696
δ6'              3.782      3.680
δCH3             1.830      1.987

Coupling constants (Hz) 

J„ S3 8.4 

10.8 
<3.7 

J 5M 5.6 <3.7 

Jgg ^± 

a: Chemical shifts were referenced tentatively by setting
the higher-field signal of the benzyl methylene protons to
4.576 ppm.



Example 6 LacdiNAc synthesizing activity of NGalNAc-T2
toward asialo/agalacto-fetal calf fetuin

As demonstrated in Tables 6 and 7, both NGalNAc-T1 and
-T2 transferred GalNAc to both O- and N-glycan substrates.
LacdiNAc (GalNAcβ1-4GlcNAc) structures have been found in
N-glycans of some human glycoproteins. Therefore, to
determine the activity of NGalNAc-T2 in transferring GalNAc
to a glycoprotein, fetal calf fetuin (FCF), which has both
N- and O-glycans, was used as an acceptor substrate.

Fetal calf fetuin (FCF), neuraminidase, β1-4
galactosidase and glycopeptidase F were purchased from Sigma,
Nacalai Tesque (Kyoto, Japan), Calbiochem and Takara,
respectively. Asialo/agalacto-FCF was prepared from 200 µg
of FCF by incubating with 4 µU of neuraminidase and 12 µU of
β1,4-galactosidase at 37 °C for 16 hr. The transfer of
GalNAc by NGalNAc-T2 to the glycoprotein was performed in 20
µl of a standard reaction mixture containing 50 µg of
asialo/agalacto-FCF produced by the glycosidase treatment.
After incubation at 37 °C for 16 hr, 5 µl of each reaction
mixture was digested with glycopeptidase F (GPF) according
to the manufacturer's instructions. For detection of
transferred GalNAc, a horseradish peroxidase (HRP)-conjugated
lectin, Wisteria floribunda agglutinin (WFA) (EY
Laboratories, San Mateo, CA), was used. 1 µl of each reaction
mixture was subjected to 12.5 % SDS-PAGE, transferred to a
nitrocellulose membrane (Schleicher & Schuell, Keene, NH)
and stained with 0.1 % HRP-conjugated WFA lectin. The
signals were detected using enhanced chemiluminescence (ECL)
and Hyperfilm ECL (Amersham Biosciences).

As shown in Fig. 3, asialo/agalacto-FCF appeared as
bands of approximately 55 and 60 kDa (lane 1). NGalNAc-T2
effectively transferred GalNAc to asialo/agalacto-FCF (lane
5). Furthermore, the band mostly disappeared after GPF
treatment, and its molecular size was detected at
approximately 45 and 50 kDa by Coomassie staining (Fig. 3,
lanes 3 and 6). The activity of NGalNAc-T1 toward
asialo/agalacto-FCF was the same as that of NGalNAc-T2
(data not shown).

Example 7 Analysis of N-glycan structures on glycodelin
from NGalNAc-T1 and -T2 gene-transfected CHO cells

As shown above, both NGalNAc-T1 and -T2 could
synthesize LacdiNAc structures on mono- and oligosaccharide
acceptors. LacdiNAc structures are known to exist in
N-glycans on some glycoproteins. We therefore examined the
ability of NGalNAc-T1 to construct LacdiNAc in vivo on
glycodelin, one of the major glycoproteins carrying LacdiNAc
structures. CHO cells were employed for this purpose because
glycodelin produced in CHO cells is devoid of any
LacdiNAc-based chains.

The glycodelin expression vector was transfected into
CHO cells expressing the NGalNAc-T1 or -T2 gene, and the
culture medium was collected after 48 hr of culture.
Glycodelin was harvested from the culture medium with a WFA
affinity column. The harvested glycodelin was subjected to
SDS-PAGE and used for lectin blotting with WFA.

As shown in Fig. 7, the non-reducing terminal GalNAc
was detected only when the NGalNAc-T1 or -T2 gene was
co-transfected with the glycodelin gene. These bands
disappeared after N-glycanase™ treatment; these GalNAc
residues therefore likely exist in N-glycans.

Example 8 Preparation of mouse proteins of the present
invention

1. Search through a genetic database and determination of
the nucleic acid sequence of a novel mouse
N-acetylgalactosaminyltransferase

A search for similar genes in a mouse genomic database
(UCSC Human Genome Project, Nov. 2001 mouse assembly archived
Sep. 15, 2002, http://genome-archive.cse.ucsc.edu/) was
performed using the genes for the existing human NGalNAc-T1
and -T2. The sequences used were SEQ ID NOs: 1, 3, 26 and 28.
The search was performed using a program such as Blast
[Altschul et al., J. Mol. Biol., 215, 403-410 (1990)].

As a result, two homologous genes were found, on mouse
chromosomes 7 and 6. The nucleotide and amino acid sequences
of the first gene, on chromosome 7, which is an ortholog of
human NGalNAc-T1, are shown as SEQ ID NOs: 26 and 28. Those
of the second gene, on chromosome 6, are shown as SEQ ID NOs:
27 and 29.

2. Integration of GalNAc-T genes into an expression vector

To prepare an expression system for each mouse
NGalNAc-T, a portion of each gene was first integrated into
pFLAG-CMV1 (Sigma).

Integration of mNGalNAc-T1 into pFLAG-CMV1

The mouse NGalNAc-T1 (mNGalNAc-T1) gene region encoding
its putative catalytic domain (amino acids 45 to 1,034) was
amplified with two primers, 5'-CCC AAG CTT CGC CTG GGC TAC
GGG CGA GAT-3' (SEQ ID NO: 31) and 5'-GCT CTA GAC TCA GGA
TCG CTG TGC GCG GGC A-3' (SEQ ID NO: 32), using cDNA derived
from mouse brain as a template. mRNA was prepared from mouse
brain with the RNeasy mini kit (Qiagen), and cDNA was
synthesized with the SuperScript first-strand synthesis
system for RT-PCR (Invitrogen). LA Taq DNA polymerase
(Takara) was used for the PCR. The amplified 2.7 kb fragment
was digested with the endonucleases HindIII and XbaI, and the
digested fragment was inserted into pFLAG-CMV1 to construct
pFLAG-mNGalNAc-T1.

Integration of mNGalNAc-T2 into pFLAG-CMV1

The mouse NGalNAc-T2 (mNGalNAc-T2) gene region encoding
its putative catalytic domain (amino acids 57 to 986) was
amplified with two primers, 5'-CCC AAG CTT CGG CCC AGG CCG
GCG GGA ACC-3' (SEQ ID NO: 33) and 5'-GGA ATT CTC ACG GCA
TCT TCA TTT GGC GA-3' (SEQ ID NO: 34), using cDNA derived
from mouse stomach as a template. mRNA was prepared from
mouse stomach with the RNeasy mini kit (Qiagen), and cDNA was
synthesized with the SuperScript first-strand synthesis
system for RT-PCR (Invitrogen). LA Taq DNA polymerase
(Takara) was used for the PCR. The amplified 2.7 kb fragment
was digested with the endonucleases HindIII and EcoRI, and
the digested fragment was inserted into pFLAG-CMV1 to
construct pFLAG-mNGalNAc-T2.

3. Transfection and expression of recombinant enzymes 

15 µg of pFLAG-mNGalNAc-T1 or pFLAG-mNGalNAc-T2 was
introduced into 2 × 10^6 HEK293T cells, which had been
cultured overnight in DMEM (Dulbecco's modified Eagle's
medium) containing 10 % FCS (fetal calf serum), using
Lipofectamine 2000 (Invitrogen Co.) according to the protocol
provided by the same company. The culture supernatant was
collected after 48-72 hours. The supernatant was mixed with
NaN3 (0.05 %), NaCl (150 mM), CaCl2 (2 mM) and an anti-M1
resin (Sigma Co.) (50 µl), and the mixture was stirred
overnight at 4 °C. The reaction mixture was centrifuged (3000
rpm, 5 min, 4 °C) to collect a pellet. The pellet was
combined with 900 µl of 2 mM CaCl2/TBS and re-centrifuged
(2000 rpm, 5 min, 4 °C), after which the pellet was suspended
in 200 µl of 1 mM CaCl2/TBS to give a sample for assaying
activity (mNGalNAc-T1 or mNGalNAc-T2 enzyme solution).

The enzyme was subjected to conventional SDS-PAGE and
Western blotting, and the expression of the intended protein
was confirmed. Anti-FLAG M2-peroxidase (A-8592, SIGMA Co.)
was used as the antibody.

Industrial Applicability

According to the present invention, an enzyme which
transfers N-acetylgalactosamine to N-acetylglucosamine via a
β1-4 linkage was isolated and the structure of its gene was
elucidated. This led to the production of said enzyme or the
like by genetic engineering techniques, the production of
oligosaccharides using said enzyme, and the diagnosis of
diseases on the basis of said gene or the like.

CLAIMS



1. An isolated protein having an amino acid sequence
which is selected from a group consisting of SEQ ID NOs: 1,
3, 26 and 28 or a variant of said amino acid sequence,
wherein one or more amino acids are substituted or deleted,
or one or more amino acids are inserted or added, having the
activity of transferring N-acetylgalactosamine to
N-acetylglucosamine via a β1-4 linkage.

2. The protein of Claim 1, wherein the amino acid
sequence is shown in SEQ ID NO: 1 or 3.

3. The protein of Claim 1, wherein the amino acid
sequence is shown in SEQ ID NO: 26 or 28.

4. The protein of Claim 1 having an identity of 50 % or
more to the amino acid sequence shown in SEQ ID NO: 1 or 26.

5. The protein of Claim 1 having an identity of 60 % or
more to the amino acid sequence shown in SEQ ID NO: 1 or 26.

6. An isolated nucleic acid encoding the protein of any
one of Claims 1 to 5.

7. A nucleic acid encoding the protein of Claim 1 or 2,
which hybridizes with a nucleic acid having the nucleotide
sequence shown in SEQ ID NO: 2 or 4 under stringent
conditions.

8. A nucleic acid encoding the protein of Claim 1 or 3,
which hybridizes with a nucleic acid having the nucleotide
sequence shown in SEQ ID NO: 27 or 29 under stringent
conditions.

9. The nucleic acid of Claim 7 having a nucleotide
sequence represented by nucleotides 1-3120 of the nucleic
acid sequence shown in SEQ ID NO: 2 or nucleotides 1-2997 of
the nucleic acid sequence shown in SEQ ID NO: 4.

10. The nucleic acid of Claim 8 having a nucleotide
sequence represented by nucleotides 1-3105 of the nucleic
acid sequence shown in SEQ ID NO: 27 or nucleotides 1-2961
of the nucleic acid sequence shown in SEQ ID NO: 29.

11. A recombinant vector containing the nucleic acid of
any one of Claims 6 to 10 and being capable of expressing
said nucleic acid in a host cell.

12. A host cell transformed with the recombinant vector
of Claim 11.

13. An analytical nucleic acid, which hybridizes to the
nucleic acid of Claim 6 under stringent conditions.

14. The analytical nucleic acid of Claim 13, which is
used as a primer and is selected from a group consisting of
SEQ ID NOs: 20, 21, 23 and 24.

15. The analytical nucleic acid of Claim 13, which is
used as a probe and is SEQ ID NO: 22 or 25.

16. The analytical nucleic acid of Claim 13, which is
used as a cancer marker.

17. An assay kit comprising the analytical nucleic acid
of any one of Claims 14 to 16 and assay instructions.

18. An antibody binding to the protein of any one of
Claims 1 to 5.

19. The antibody of Claim 18, which is a monoclonal
antibody.

20. A method for determining a canceration of a
biological sample comprising the steps of:

(a) quantifying the protein of any one of Claims 1 to 5 in
the biological sample; and

(b) estimating that the biological sample is cancerous in a
case that the quantity value of said protein in the
biological sample is more than that in a control biological
sample.

21. The method of Claim 20, wherein said protein is
quantified by use of the antibody of Claim 18 or 19.

22. A method for determining a canceration of a
biological sample comprising the steps of:

(a) quantifying the nucleic acid of Claim 6 in the
biological sample; and

(b) estimating that the biological sample is cancerous in a
case that the quantity value of the nucleic acid of Claim 6
in the biological sample is 1.5 times or more than that in a
control biological sample.

23. The method of Claim 22, comprising the steps of:

(a) hybridizing at least one of the analytical nucleic
acids of Claim 13 to the nucleic acid of Claim 6 in the
biological sample;

(b) amplifying the nucleic acid of Claim 6;

(c) hybridizing the analytical nucleic acids of Claim 13 to
the amplification product;

(d) quantifying a signal arising from said amplification
product and said analytical nucleic acid hybridized; and

(e) estimating that the biological sample is cancerous in
the case that the quantity value of said signal is 1.5 times
or more than that of a corresponding signal of a control
biological sample.

SEQUENCE LISTING

<110> NATIONAL INSTITUTE OF ADVANCED INDUSTRIAL SCIENCE AND TECHNOLOGY
AMERSHAM BIOSCIENCES KK

<120> Novel acetylgalactosamine transferases and nucleic acids encoding
the same

<130> YCT-860

<150> JP2002-236292
<151> 2002-08-14

<160> 34

<210> 1

<211> 1039

<212> PRT

<213> Homo sapiens

<400> 1

Met Pro Arg Leu Pro Val Lys Lys Ile Arg Lys Gln Met Lys Leu Leu
1 5 10 15

Leu Leu Leu Leu Leu Leu Ser Cys Ala Ala Trp Leu Thr Tyr Val His
20 25 30

Leu Gly Leu Val Arg Gln Gly Arg Ala Leu Arg Gln Arg Leu Gly Tyr
35 40 45

Gly Arg Asp Gly Glu Lys Leu Thr Ser Glu Thr Asp Gly Arg Gly Val
50 55 60

His Ala Ala Pro Ser Thr Gin Arg Ala Glu Asp Ser Ser Glu Ser Arg 
65 70 75 80 

Glu Glu Glu Gin Ala Pro Glu Gly Arg Asp Leu Asp Met Leu Phe Pro 
85 90 95 

Gly Gly Ala Gly Arg Leu Pro Leu Asn Phe Thr His Gin Thr Pro Pro 
100 105 110 

Trp Arg Glu Glu Tyr Lys Gly Gin Val Asn Leu His Val Phe Glu Asp 
115 120 125 

Trp Cys Gly Gly Ala Val Gly His Leu Arg Arg Asn Leu His Phe Pro 
130 135 140 

Leu Phe Pro His Thr Arg Thr Thr Val Lys Lys Leu Ala Val Ser Pro 
145 150 155 160 

Lys Trp Lys Asn Tyr Gly Leu Arg He Phe Gly Phe He His Pro Ala 
165 170 175 

Arg Asp Gly Asp Val Gin Phe Ser Val Ala Ser Asp Asp Asn Ser Glu 
180 185 190 

Phe Trp Leu Ser Leu Asp Glu Ser Pro Ala Ala Ala Gin Leu Val Ala 
195 200 205 
Phe Val Gly Lys Thr Gly Ser Glu Trp Thr Ala Pro Gly Glu Phe Thr
210 215 220

Lys Phe Ser Ser Gln Val Ser Lys Pro Arg Arg Leu Met Ala Ser Arg
225 230 235 240

Arg Tyr Tyr Phe Glu Leu Leu His Lys Gln Asp Asp Arg Gly Ser Asp
245 250 255

His Val Glu Val Gly Trp Arg Ala Phe Leu Pro Gly Leu Lys Phe Glu
260 265 270

Val Ile Ser Ser Ala His Ile Ser Leu Tyr Thr Asp Glu Ser Ala Leu
275 280 285

Lys Met Asp His Val Ala His Val Pro Gln Ser Pro Ala Ser His Val
290 295 300

Gly Gly Arg Pro Pro Gln Glu Glu Thr Ser Ala Asp Met Leu Arg Pro
305 310 315 320

Asp Pro Arg Asp Thr Phe Phe Leu Thr Pro Arg Met Glu Ser Ser Ser
325 330 335

Leu Glu Asn Val Leu Glu Pro Cys Ala Tyr Ala Pro Thr Tyr Val Val
340 345 350

Lys Asp Phe Pro Ile Ala Arg Tyr Gln Gly Leu Gln Phe Val Tyr Leu
355 360 365

Ser Phe Val Tyr Pro Asn Asp Tyr Thr Arg Leu Thr His Met Glu Thr 
370 375 380 

Asp Asn Lys Cys Phe Tyr Arg Glu Ser Pro Leu Tyr Leu Glu Arg Phe 
385 390 395 400 

Gly Phe Tyr Lys Tyr Met Lys Met Asp Lys Glu Glu Gly Asp Glu Asp 
405 410 415 

Glu Glu Asp Glu Val Gln Arg Arg Ala Phe Leu Phe Leu Asn Pro Asp
420 425 430

Asp Phe Leu Asp Asp Glu Asp Glu Gly Glu Leu Leu Asp Ser Leu Glu
435 440 445

Pro Thr Glu Ala Ala Pro Pro Arg Ser Gly Pro Gln Ser Pro Ala Pro
450 455 460

Ala Ala Pro Ala Gln Pro Gly Ala Thr Leu Ala Pro Pro Thr Pro Pro
465 470 475 480

Arg Pro Arg Asp Gly Gly Thr Pro Arg His Ser Arg Ala Leu Ser Trp
485 490 495

Ala Ala Arg Ala Ala Arg Pro Leu Pro Leu Phe Leu Gly Arg Ala Pro
500 505 510

Pro Pro Arg Pro Ala Val Glu Gln Pro Pro Pro Lys Val Tyr Val Thr
515 520 525 

Arg Val Arg Pro Gly Gln Arg Ala Ser Pro Arg Ala Pro Ala Pro Arg
530 535 540

Ala Pro Trp Pro Pro Phe Pro Gly Val Phe Leu His Pro Arg Pro Leu
545 550 555 560

Pro Arg Val Gln Leu Arg Ala Pro Pro Arg Pro Pro Arg Pro His Gly
565 570 575

Arg Arg Thr Gly Gly Pro Gln Ala Thr Gln Pro Arg Pro Pro Ala Arg
580 585 590

Ala Gln Ala Thr Gln Gly Gly Arg Glu Gly Gln Ala Arg Thr Leu Gly
595 600 605

Pro Ala Ala Pro Thr Val Asp Ser Asn Leu Ser Ser Glu Ala Arg Pro
610 615 620

Val Thr Ser Phe Leu Ser Leu Ser Gln Val Ser Gly Pro Gln Leu Pro
625 630 635 640

Gly Glu Gly Glu Glu Glu Glu Glu Gly Glu Asp Asp Gly Ala Pro Gly
645 650 655

Asp Glu Ala Ala Ser Glu Asp Ser Glu Glu Ala Ala Gly Pro Ala Leu
660 665 670

Gly Arg Trp Arg Glu Asp Ala Ile Asp Trp Gln Arg Thr Phe Ser Val
675 680 685 

Gly Ala Val Asp Phe Glu Leu Leu Arg Ser Asp Trp Asn Asp Leu Arg 
690 695 700 

Cys Asn Val Ser Gly Asn Leu Gln Leu Pro Glu Ala Glu Ala Val Asp
705 710 715 720

Val Thr Ala Gln Tyr Met Glu Arg Leu Asn Ala Arg His Gly Gly Arg
725 730 735

Phe Ala Leu Leu Arg Ile Val Asn Val Glu Lys Arg Arg Asp Ser Ala
740 745 750

Arg Gly Ser Arg Phe Leu Leu Glu Leu Glu Leu Gln Glu Arg Gly Gly
755 760 765

Gly Arg Leu Arg Leu Ser Glu Tyr Val Phe Leu Arg Leu Pro Gly Ala
770 775 780

Arg Val Gly Asp Ala Asp Gly Glu Ser Pro Glu Pro Ala Pro Ala Ala
785 790 795 800

Ser Val Arg Pro Asp Gly Arg Pro Glu Leu Cys Arg Pro Leu Arg Leu
805 810 815

Ala Trp Arg Gln Asp Val Met Val His Phe Ile Val Pro Val Lys Asn
820 825 830

Gln Ala Arg Trp Val Ala Gln Phe Leu Ala Asp Met Ala Ala Leu His
835 840 845 

Ala Arg Thr Gly Asp Ser Arg Phe Ser Val Val Leu Val Asp Phe Glu 
850 855 860 

Ser Glu Asp Met Asp Val Glu Arg Ala Leu Arg Ala Ala Arg Leu Pro 
865 870 875 880 

Arg Tyr Gln Tyr Leu Arg Arg Thr Gly Asn Phe Glu Arg Ser Ala Gly
885 890 895

Leu Gln Ala Gly Val Asp Ala Val Glu Asp Ala Ser Ser Ile Val Phe
900 905 910

Leu Cys Asp Leu His Ile His Phe Pro Pro Asn Ile Leu Asp Gly Ile
915 920 925

Arg Lys His Cys Val Glu Gly Arg Leu Ala Phe Ala Pro Val Val Met
930 935 940

Arg Leu Ser Cys Gly Ser Ser Pro Arg Asp Pro His Gly Tyr Trp Glu
945 950 955 960

Val Asn Gly Phe Gly Leu Phe Gly Ile Tyr Lys Ser Asp Phe Asp Arg
965 970 975

Val Gly Gly Met Asn Thr Glu Glu Phe Arg Asp Gln Trp Gly Gly Glu
980 985 990

Asp Trp Glu Leu Leu Asp Arg Val Leu Gln Ala Gly Leu Glu Val Glu
995 1000 1005 

Arg Leu Arg Leu Arg Asn Phe Tyr His His Tyr His Ser Lys Arg Gly 
1010 1015 1020 

Met Trp Ser Val Arg Ser Arg Lys Gly Ser Arg Thr Gly Ala Ser 
1025 1030 1035 1039 



<210> 2 

<211> 3120 

<212> DNA 

<213> Homo sapiens 

<400> 2 

atg ccg cgg ctc ccg gtg aag aag atc cgt aag cag atg aag ctg ctg 48
Met Pro Arg Leu Pro Val Lys Lys Ile Arg Lys Gln Met Lys Leu Leu
1 5 10 15

ctg ctg ctg ctg ctg ctg agc tgc gcc gcg tgg ctc acc tac gtg cac 96
Leu Leu Leu Leu Leu Leu Ser Cys Ala Ala Trp Leu Thr Tyr Val His
20 25 30

ctg ggc ctg gtg cgc cag gga cgc gcg ctg cgc cag cgc ctg ggc tac 144
Leu Gly Leu Val Arg Gln Gly Arg Ala Leu Arg Gln Arg Leu Gly Tyr
35 40 45

ggg cga gat ggt gag aag ctg acc agt gag acc gac ggc cgg ggg gtc 192
Gly Arg Asp Gly Glu Lys Leu Thr Ser Glu Thr Asp Gly Arg Gly Val
50 55 60

cac gct gcg cca tcc aca cag agg gct gag gac tcc agt gag agc cgt 240
His Ala Ala Pro Ser Thr Gln Arg Ala Glu Asp Ser Ser Glu Ser Arg
65 70 75 80

gaa gag gag caa gcg ccc gaa ggt cgg gac cta gac atg ctg ttt cct 288
Glu Glu Glu Gln Ala Pro Glu Gly Arg Asp Leu Asp Met Leu Phe Pro
85 90 95

ggg ggg gct ggg agg ctg cca ctg aac ttc acc cat cag aca ccc cca 336
Gly Gly Ala Gly Arg Leu Pro Leu Asn Phe Thr His Gln Thr Pro Pro
100 105 110

tgg cgg gag gag tac aag ggg cag gtg aac ctg cac gtg ttt gag gac 384
Trp Arg Glu Glu Tyr Lys Gly Gln Val Asn Leu His Val Phe Glu Asp
115 120 125

tgg tgt ggg ggc gcc gtg ggc cac ctg agg agg aac ctg cac ttc ccg 432
Trp Cys Gly Gly Ala Val Gly His Leu Arg Arg Asn Leu His Phe Pro
130 135 140

ctg ttc cct cat acg cgc acc acc gtg aag aag ttg gcc gtg tcc ccc 480
Leu Phe Pro His Thr Arg Thr Thr Val Lys Lys Leu Ala Val Ser Pro
145 150 155 160 
aag tgg aag aac tat gga ctc cgt att ttt ggt ttc atc cac ccg gcg 528
Lys Trp Lys Asn Tyr Gly Leu Arg Ile Phe Gly Phe Ile His Pro Ala
165 170 175

agg gac gga gac gtc cag ttt tct gtg gcc tca gac gac aac tcg gag 576
Arg Asp Gly Asp Val Gln Phe Ser Val Ala Ser Asp Asp Asn Ser Glu
180 185 190

ttc tgg ctg agt ctg gac gag agc cct gct gct gcc cag ctt gtg gcc 624
Phe Trp Leu Ser Leu Asp Glu Ser Pro Ala Ala Ala Gln Leu Val Ala
195 200 205

ttt gtg ggc aag act ggc tcc gag tgg aca gcg cct gga gaa ttc acc 672
Phe Val Gly Lys Thr Gly Ser Glu Trp Thr Ala Pro Gly Glu Phe Thr
210 215 220

aag ttc agc tcc cag gtg tcc aag ccc agg cgg ctc atg gcc tcc cgg 720
Lys Phe Ser Ser Gln Val Ser Lys Pro Arg Arg Leu Met Ala Ser Arg
225 230 235 240

agg tac tac ttt gag ttg ctg cac aag cag gac gac cgc ggc tcg gac 768
Arg Tyr Tyr Phe Glu Leu Leu His Lys Gln Asp Asp Arg Gly Ser Asp
245 250 255

cac gtg gaa gtg ggc tgg cga gct ttc ctg ccc ggc ctg aag ttc gag 816
His Val Glu Val Gly Trp Arg Ala Phe Leu Pro Gly Leu Lys Phe Glu
260 265 270

gtc atc agc tct gct cac atc tcc ctg tac aca gat gag tca gcc ttg 864
Val Ile Ser Ser Ala His Ile Ser Leu Tyr Thr Asp Glu Ser Ala Leu
275 280 285

aag atg gac cac gtg gcg cac gtc ccc cag tct cca gcc agc cac gtg 912
Lys Met Asp His Val Ala His Val Pro Gln Ser Pro Ala Ser His Val
290 295 300

ggg ggg cgt ccg ccg cag gag gag acc agc gca gac atg ctg cgg cca 960
Gly Gly Arg Pro Pro Gln Glu Glu Thr Ser Ala Asp Met Leu Arg Pro
305 310 315 320

gat ccc agg gat acc ttt ttc ctc act cca cgc atg gaa tct tcg agc 1008
Asp Pro Arg Asp Thr Phe Phe Leu Thr Pro Arg Met Glu Ser Ser Ser
325 330 335

ctg gag aac gtg ctg gag ccc tgc gcc tac gcc ccc acc tac gtg gtc 1056
Leu Glu Asn Val Leu Glu Pro Cys Ala Tyr Ala Pro Thr Tyr Val Val
340 345 350

aag gac ttc ccg atc gcc aga tac cag ggc ctg caa ttt gtg tac ctg 1104
Lys Asp Phe Pro Ile Ala Arg Tyr Gln Gly Leu Gln Phe Val Tyr Leu
355 360 365

tcc ttc gtt tat ccc aac gac tac act cgc ctc acc cac atg gag acg 1152
Ser Phe Val Tyr Pro Asn Asp Tyr Thr Arg Leu Thr His Met Glu Thr
370 375 380

gac aac aag tgc ttc tac cgc gag tct ccg ctg tat ctg gag agg ttt 1200
Asp Asn Lys Cys Phe Tyr Arg Glu Ser Pro Leu Tyr Leu Glu Arg Phe
385 390 395 400 

ggg ttc tat aaa tac atg aag atg gac aag gag gag ggg gat gag gat 1248 
Gly Phe Tyr Lys Tyr Met Lys Met Asp Lys Glu Glu Gly Asp Glu Asp 
405 410 415 

gaa gaa gac gag gtg cag cgc cga gcc ttc ctc ttc ctc aac ccg gac 1296
Glu Glu Asp Glu Val Gln Arg Arg Ala Phe Leu Phe Leu Asn Pro Asp
420 425 430

gac ttc ctg gac gac gag gac gag ggg gag ctg ctc gac agc ctg gag 1344
Asp Phe Leu Asp Asp Glu Asp Glu Gly Glu Leu Leu Asp Ser Leu Glu
435 440 445

ccc acc gag gcg gcc ccg ccc agg agc ggc ccc cag tcc ccc gcc cca 1392
Pro Thr Glu Ala Ala Pro Pro Arg Ser Gly Pro Gln Ser Pro Ala Pro
450 455 460

gca gcc ccc gcc cag ccc gga gcc acc ctc gcc ccg ccg acc cct ccc 1440
Ala Ala Pro Ala Gln Pro Gly Ala Thr Leu Ala Pro Pro Thr Pro Pro
465 470 475 480

cgc ccc cgg gac ggg ggg acc ccc agg cac tcc cgg gcc ctg agc tgg 1488
Arg Pro Arg Asp Gly Gly Thr Pro Arg His Ser Arg Ala Leu Ser Trp
485 490 495

gcc gcc agg gcc gcc cgc cct ttg ccg ctc ttc ttg ggc cga gct ccg 1536
Ala Ala Arg Ala Ala Arg Pro Leu Pro Leu Phe Leu Gly Arg Ala Pro
500 505 510



ccc ccg cgc cct gca gtg gag cag ccg ccc cca aag gtg tac gtg acc 1584
Pro Pro Arg Pro Ala Val Glu Gln Pro Pro Pro Lys Val Tyr Val Thr
515 520 525

agg gtg cgg ccg gga cag cgg gca tcc ccc cgg gcc cca gcg ccg cgt 1632
Arg Val Arg Pro Gly Gln Arg Ala Ser Pro Arg Ala Pro Ala Pro Arg
530 535 540

gcg ccc tgg ccg ccc ttc cct ggc gtc ttc ctg cac ccc agg cct ctg 1680
Ala Pro Trp Pro Pro Phe Pro Gly Val Phe Leu His Pro Arg Pro Leu
545 550 555 560

ccc aga gtg cag ctg cgg gcg ccc cca cgc cca ccc cgg ccc cac ggc 1728
Pro Arg Val Gln Leu Arg Ala Pro Pro Arg Pro Pro Arg Pro His Gly
565 570 575

cgc agg acc ggc ggc ccc cag gcc aca cag ccg agg ccc cca gcc cgg 1776
Arg Arg Thr Gly Gly Pro Gln Ala Thr Gln Pro Arg Pro Pro Ala Arg
580 585 590

gcg cag gcc acc caa ggg ggc cgg gag ggc cag gcg cgc acg ctg gga 1824
Ala Gln Ala Thr Gln Gly Gly Arg Glu Gly Gln Ala Arg Thr Leu Gly
595 600 605

cct gcg gcg ccc aca gtg gac tca aac ttg tcc tcc gaa gcg cgg ccc 1872
Pro Ala Ala Pro Thr Val Asp Ser Asn Leu Ser Ser Glu Ala Arg Pro
610 615 620
gtg acc tcc ttc ctg agc ttg tcc cag gtg tcc ggg ccg cag ctg ccc 1920
Val Thr Ser Phe Leu Ser Leu Ser Gln Val Ser Gly Pro Gln Leu Pro
625 630 635 640

ggg gag ggc gaa gag gag gag gaa ggg gag gac gat ggg gcc ccg ggc 1968
Gly Glu Gly Glu Glu Glu Glu Glu Gly Glu Asp Asp Gly Ala Pro Gly
645 650 655

gac gag gcc gcg tcg gag gac agc gag gag gcc gcg ggc ccg gcg ctc 2016
Asp Glu Ala Ala Ser Glu Asp Ser Glu Glu Ala Ala Gly Pro Ala Leu
660 665 670

gga cgc tgg cgt gag gac gcc atc gac tgg cag cgc acg ttc agc gtg 2064
Gly Arg Trp Arg Glu Asp Ala Ile Asp Trp Gln Arg Thr Phe Ser Val
675 680 685

ggc gcc gtg gac ttc gag ctg ctg cgc tcg gac tgg aac gac ctg cga 2112
Gly Ala Val Asp Phe Glu Leu Leu Arg Ser Asp Trp Asn Asp Leu Arg
690 695 700

tgc aac gtt tcg ggg aac ctg cag ctg ccg gag gcg gag gcc gtg gac 2160
Cys Asn Val Ser Gly Asn Leu Gln Leu Pro Glu Ala Glu Ala Val Asp
705 710 715 720

gtg acc gct cag tac atg gag cgg ctg aac gcg cgc cac ggc ggg cgc 2208
Val Thr Ala Gln Tyr Met Glu Arg Leu Asn Ala Arg His Gly Gly Arg
725 730 735
ttc gcg ctt ctg cgc atc gtg aac gtg gag aag cgc cgg gac tcg gcg 2256
Phe Ala Leu Leu Arg Ile Val Asn Val Glu Lys Arg Arg Asp Ser Ala
740 745 750

cga ggg agt cgc ttc ctg ctg gag ctg gag ctg cag gag cgc ggg ggc 2304
Arg Gly Ser Arg Phe Leu Leu Glu Leu Glu Leu Gln Glu Arg Gly Gly
755 760 765

ggc cgc ctg cga ctg tcc gag tac gtc ttc ctg cgg ctg ccg gga gcc 2352
Gly Arg Leu Arg Leu Ser Glu Tyr Val Phe Leu Arg Leu Pro Gly Ala
770 775 780

cgc gta ggg gat gca gac gga gaa agt ccc gaa ccc gct ccc gcc gcc 2400
Arg Val Gly Asp Ala Asp Gly Glu Ser Pro Glu Pro Ala Pro Ala Ala
785 790 795 800

tcc gtg cgc ccc gac ggc cgc ccc gag ctc tgc cgg cca ctg cgc ctg 2448
Ser Val Arg Pro Asp Gly Arg Pro Glu Leu Cys Arg Pro Leu Arg Leu
805 810 815

gcc tgg cgc cag gac gtg atg gtt cac ttc atc gtg cca gtg aaa aac 2496
Ala Trp Arg Gln Asp Val Met Val His Phe Ile Val Pro Val Lys Asn
820 825 830

cag gca cgg tgg gtg gca cag ttc ctg gcg gac atg gct gcg ctg cac 2544
Gln Ala Arg Trp Val Ala Gln Phe Leu Ala Asp Met Ala Ala Leu His
835 840 845

gcg cgc acc ggg gac tcg cgt ttc agc gtc gtc ctg gtg gat ttc gag 2592
Ala Arg Thr Gly Asp Ser Arg Phe Ser Val Val Leu Val Asp Phe Glu
850 855 860 

agc gag gat atg gac gtg gag cgg gcc ctg cgc gcc gcg cgc ctg ccc 2640
Ser Glu Asp Met Asp Val Glu Arg Ala Leu Arg Ala Ala Arg Leu Pro
865 870 875 880

cgg tac cag tac ctg aga cga acc ggg aac ttc gag cgc tcc gcc ggg 2688
Arg Tyr Gln Tyr Leu Arg Arg Thr Gly Asn Phe Glu Arg Ser Ala Gly
885 890 895

ctg cag gcg gga gtg gac gcg gta gag gac gcc agc agc atc gtg ttc 2736
Leu Gln Ala Gly Val Asp Ala Val Glu Asp Ala Ser Ser Ile Val Phe
900 905 910

ctc tgc gac ctg cac atc cac ttc cca ccc aac atc ctg gac ggc atc 2784
Leu Cys Asp Leu His Ile His Phe Pro Pro Asn Ile Leu Asp Gly Ile
915 920 925

cgc aag cac tgc gtg gag ggc agg ctg gcc ttc gcg ccc gtg gtc atg 2832
Arg Lys His Cys Val Glu Gly Arg Leu Ala Phe Ala Pro Val Val Met
930 935 940

cgc ctg agc tgc ggg agc tcg ccc cgg gac ccc cac ggt tac tgg gag 2880
Arg Leu Ser Cys Gly Ser Ser Pro Arg Asp Pro His Gly Tyr Trp Glu
945 950 955 960

gtg aac ggc ttt ggc ctt ttt ggg atc tac aag tcg gac ttt gac cgg 2928
Val Asn Gly Phe Gly Leu Phe Gly Ile Tyr Lys Ser Asp Phe Asp Arg
965 970 975

gtt gga gga atg aac acg gag gag ttc cga gac cag tgg ggg ggt gaa 2976
Val Gly Gly Met Asn Thr Glu Glu Phe Arg Asp Gln Trp Gly Gly Glu
980 985 990

gac tgg gag ctc ctg gac agg gtc ctg cag gca ggg ctg gag gtg gag 3024
Asp Trp Glu Leu Leu Asp Arg Val Leu Gln Ala Gly Leu Glu Val Glu
995 1000 1005

cgg ctc cga ctg cgg aat ttc tat cac cac tac cac tcc aag agg ggc 3072
Arg Leu Arg Leu Arg Asn Phe Tyr His His Tyr His Ser Lys Arg Gly
1010 1015 1020

atg tgg agc gtc cgc agc agg aag ggc tct cgc acg ggg gcg tct tga 3120
Met Trp Ser Val Arg Ser Arg Lys Gly Ser Arg Thr Gly Ala Ser 
1025 1030 1035 1039 



<210> 3 

<211> 998 

<212> PRT 

<213> Homo sapiens 

<400> 3 

Met Gly Ser Pro Arg Ala Ala Arg Pro Pro Leu Leu Leu Arg Pro Val 
1 5 10 15

Lys Leu Leu Arg Arg Arg Phe Arg Leu Leu Leu Ala Leu Ala Val Val
20 25 30

Ser Val Gly Leu Trp Thr Leu Tyr Leu Glu Leu Val Ala Ser Ala Gln
35 40 45

Val Gly Gly Asn Pro Leu Asn Arg Arg Tyr Gly Ser Trp Arg Glu Leu
50 55 60

Ala Lys Ala Leu Ala Ser Arg Asn Ile Pro Ala Val Asp Pro His Leu
65 70 75 80

Gln Phe Tyr His Pro Gln Arg Leu Ser Leu Glu Asp His Asp Ile Asp
85 90 95

Gln Gly Val Ser Ser Asn Ser Ser Tyr Leu Lys Trp Asn Lys Pro Val
100 105 110

Pro Trp Leu Ser Glu Phe Arg Gly Arg Ala Asn Leu His Val Phe Glu
115 120 125

Asp Trp Cys Gly Ser Ser Ile Gln Gln Leu Arg Arg Asn Leu His Phe
130 135 140

Pro Leu Tyr Pro His Ile Arg Thr Thr Leu Arg Lys Leu Ala Val Ser
145 150 155 160

Pro Lys Trp Thr Asn Tyr Gly Leu Arg Ile Phe Gly Tyr Leu His Pro
165 170 175

Phe Thr Asp Gly Lys Ile Gln Phe Ala Ile Ala Ala Asp Asp Asn Ala
180 185 190 

Glu Phe Trp Leu Ser Leu Asp Asp Gln Val Ser Gly Leu Gln Leu Leu
195 200 205

Ala Ser Val Gly Lys Thr Gly Lys Glu Trp Thr Ala Pro Gly Glu Phe
210 215 220

Gly Lys Phe Arg Ser Gln Ile Ser Lys Pro Val Ser Leu Ser Ala Ser
225 230 235 240

His Arg Tyr Tyr Phe Glu Val Leu His Lys Gln Asn Glu Glu Gly Thr
245 250 255

Asp His Val Glu Val Ala Trp Arg Arg Asn Asp Pro Gly Ala Lys Phe
260 265 270

Thr Ile Ile Asp Ser Leu Ser Leu Ser Leu Phe Thr Asn Glu Thr Phe
275 280 285

Leu Gln Met Asp Glu Val Gly His Ile Pro Gln Thr Ala Ala Ser His
290 295 300

Val Asp Ser Ser Asn Ala Leu Pro Arg Asp Glu Gln Pro Pro Ala Asp
305 310 315 320

Met Leu Arg Pro Asp Pro Arg Asp Thr Leu Tyr Arg Val Pro Leu Ile
325 330 335

Pro Lys Ser His Leu Arg His Val Leu Pro Asp Cys Pro Tyr Lys Pro
340 345 350 

Ser Tyr Leu Val Asp Gly Leu Pro Leu Gln Arg Tyr Gln Gly Leu Arg
355 360 365

Phe Val His Leu Ser Phe Val Tyr Pro Asn Asp Tyr Thr Arg Leu Ser
370 375 380

His Met Glu Thr His Asn Lys Cys Phe Tyr Gln Glu Asn Ala Tyr Tyr
385 390 395 400

Gln Asp Arg Phe Ser Phe Gln Glu Tyr Ile Arg Ile Asp Gln Pro Glu
405 410 415

Lys Gln Gly Leu Glu Gln Pro Gly Phe Glu Glu Asn Leu Leu Glu Glu
420 425 430

Ser Gln Tyr Gly Glu Val Ala Glu Glu Thr Pro Ala Ser Asn Asn Gln
435 440 445

Asn Ala Arg Met Leu Glu Gly Arg Gln Thr Pro Ala Ser Thr Leu Glu
450 455 460

Gln Asp Ala Thr Asp Tyr Arg Leu Arg Ser Leu Arg Lys Leu Leu Ala
465 470 475 480

Gln Pro Arg Glu Gly Leu Leu Ala Pro Phe Ser Lys Arg Asn Ser Thr
485 490 495

Ala Ser Phe Pro Gly Arg Thr Ser His Ile Pro Val Gln Gln Pro Glu
500 505 510

Lys Arg Lys Gln Lys Pro Ser Pro Glu Pro Ser Gln Asp Ser Pro His
515 520 525

Ser Asp Lys Trp Pro Pro Gly His Pro Val Lys Asn Leu Pro Gln Met
530 535 540

Arg Gly Pro Arg Pro Arg Pro Ala Gly Asp Ser Pro Arg Lys Thr Gln
545 550 555 560

Trp Leu Asn Gln Val Glu Ser Tyr Ile Ala Glu Gln Arg Arg Gly Asp
565 570 575

Arg Met Arg Pro Gln Ala Pro Gly Arg Gly Trp His Gly Glu Glu Glu
580 585 590

Val Val Ala Ala Ala Gly Gln Glu Gly Gln Val Glu Gly Glu Glu Glu
595 600 605

Gly Glu Glu Glu Glu Glu Glu Glu Asp Met Ser Glu Val Phe Glu Tyr
610 615 620

Val Pro Val Phe Asp Pro Val Val Asn Trp Asp Gln Thr Phe Ser Ala
625 630 635 640

Arg Asn Leu Asp Phe Gln Ala Leu Arg Thr Asp Trp Ile Asp Leu Ser
645 650 655 

Cys Asn Thr Ser Gly Asn Leu Leu Leu Pro Glu Gln Glu Ala Leu Glu
660 665 670

Val Thr Arg Val Phe Leu Lys Lys Leu Asn Gln Arg Ser Arg Gly Arg
675 680 685

Tyr Gln Leu Gln Arg Ile Val Asn Val Glu Lys Arg Gln Asp Gln Leu
690 695 700

Arg Gly Gly Arg Tyr Leu Leu Glu Leu Glu Leu Leu Glu Gln Gly Gln
705 710 715 720

Arg Val Val Arg Leu Ser Glu Tyr Val Ser Ala Arg Gly Trp Gln Gly
725 730 735

Ile Asp Pro Ala Gly Gly Glu Glu Val Glu Ala Arg Asn Leu Gln Gly
740 745 750

Leu Val Trp Asp Pro His Asn Arg Arg Arg Gln Val Leu Asn Thr Arg
755 760 765

Ala Gln Glu Pro Lys Leu Cys Trp Pro Gln Gly Phe Ser Trp Ser His
770 775 780

Arg Ala Val Val His Phe Val Val Pro Val Lys Asn Gln Ala Arg Trp
785 790 795 800

Val Gln Gln Phe Ile Lys Asp Met Glu Asn Leu Phe Gln Val Thr Gly
805 810 815 

Asp Pro His Phe Asn Ile Val Ile Thr Asp Tyr Ser Ser Glu Asp Met
820 825 830

Asp Val Glu Met Ala Leu Lys Arg Ser Lys Leu Arg Ser Tyr Gln Tyr
835 840 845

Val Lys Leu Ser Gly Asn Phe Glu Arg Ser Ala Gly Leu Gln Ala Gly
850 855 860

Ile Asp Leu Val Lys Asp Pro His Ser Ile Ile Phe Leu Cys Asp Leu
865 870 875 880

His Ile His Phe Pro Ala Gly Val Ile Asp Ala Ile Arg Lys His Cys
885 890 895

Val Glu Gly Lys Met Ala Phe Ala Pro Met Val Met Arg Leu His Cys
900 905 910

Gly Ala Thr Pro Gln Trp Pro Glu Gly Tyr Trp Glu Val Asn Gly Phe
915 920 925

Gly Leu Leu Gly Ile Tyr Lys Ser Asp Leu Asp Arg Ile Gly Gly Met
930 935 940

Asn Thr Lys Glu Phe Arg Asp Arg Trp Gly Gly Glu Asp Trp Glu Leu
945 950 955 960

Leu Asp Arg Ile Leu Gln Ala Gly Leu Asp Val Glu Arg Leu Ser Leu
965 970 975

Arg Asn Phe Phe His His Phe His Ser Lys Arg Gly Met Trp Ser Arg
980 985 990

Arg Gln Met Lys Thr Leu
995 998 



<210> 4 

<211> 2997 

<212> DNA 

<213> Homo sapiens 

<400> 4 

atg ggg agc ccc cgg gcc gcg cgg ccc ccg ctg ctc ctg cgc ccg gtg 48
Met Gly Ser Pro Arg Ala Ala Arg Pro Pro Leu Leu Leu Arg Pro Val
1 5 10 15

aag ctg ctg cgg agg cgc ttc cgg ctg ctg ctg gcg ctc gcc gtg gtg 96
Lys Leu Leu Arg Arg Arg Phe Arg Leu Leu Leu Ala Leu Ala Val Val
20 25 30

tct gtg ggg ctc tgg act ctg tat ctg gaa ctg gtg gcg tcg gcc cag 144
Ser Val Gly Leu Trp Thr Leu Tyr Leu Glu Leu Val Ala Ser Ala Gln
35 40 45

gtc ggc ggg aac ccc ctg aac cgg agg tac ggc agc tgg aga gaa cta 192
Val Gly Gly Asn Pro Leu Asn Arg Arg Tyr Gly Ser Trp Arg Glu Leu
50 55 60

gcc aag gct ctg gcc agc agg aac att cca gct gtg gat cca cac ctc 240
Ala Lys Ala Leu Ala Ser Arg Asn Ile Pro Ala Val Asp Pro His Leu
65 70 75 80

cag ttc tac cat ccc cag agg ctg agc ctc gag gac cac gac att gac 288
Gln Phe Tyr His Pro Gln Arg Leu Ser Leu Glu Asp His Asp Ile Asp
85 90 95

caa ggg gtg agc agt aac agc agc tac ttg aag tgg aac aag cct gtc 336
Gln Gly Val Ser Ser Asn Ser Ser Tyr Leu Lys Trp Asn Lys Pro Val
100 105 110

ccc tgg ctc tca gag ttc cgg ggc cgt gcc aac ctg cat gtg ttt gaa 384
Pro Trp Leu Ser Glu Phe Arg Gly Arg Ala Asn Leu His Val Phe Glu
115 120 125

gac tgg tgt ggc agc tct atc cag cag ctc agg agg aac ctg cat ttc 432
Asp Trp Cys Gly Ser Ser Ile Gln Gln Leu Arg Arg Asn Leu His Phe
130 135 140

cca ctg tac ccc cat att cgc aca acc ctg agg aag ctt gct gtg tcc 480
Pro Leu Tyr Pro His Ile Arg Thr Thr Leu Arg Lys Leu Ala Val Ser
145 150 155 160
ccc aaa tgg acc aac tat ggc ctc cgc atc ttt ggc tac ctg cac ccc 528
Pro Lys Trp Thr Asn Tyr Gly Leu Arg Ile Phe Gly Tyr Leu His Pro
165 170 175

ttt act gat ggg aaa atc cag ttt gcc att gct gca gat gac aac gcg 576
Phe Thr Asp Gly Lys Ile Gln Phe Ala Ile Ala Ala Asp Asp Asn Ala
180 185 190

gag ttc tgg ctg agc ctc gat gac cag gtc tca ggc ctc cag ctg ctg 624
Glu Phe Trp Leu Ser Leu Asp Asp Gln Val Ser Gly Leu Gln Leu Leu
195 200 205

gcc agt gtg ggc aag act gga aag gag tgg acc gcc ccg gga gag ttt 672
Ala Ser Val Gly Lys Thr Gly Lys Glu Trp Thr Ala Pro Gly Glu Phe
210 215 220

ggg aaa ttt cgg agc caa att tcc aag ccg gtg agc ctg tca gcc tcc 720
Gly Lys Phe Arg Ser Gln Ile Ser Lys Pro Val Ser Leu Ser Ala Ser
225 230 235 240

cac agg tac tac ttc gag gtg ctg cac aag cag aat gag gag ggc acc 768
His Arg Tyr Tyr Phe Glu Val Leu His Lys Gln Asn Glu Glu Gly Thr
245 250 255

gac cac gtg gaa gtt gca tgg cga cgg aac gac cct gga gcc aag ttc 816
Asp His Val Glu Val Ala Trp Arg Arg Asn Asp Pro Gly Ala Lys Phe
260 265 270

acc atc att gac tcc ctc tcc ctg tcc ctc ttc aca aat gag acg ttc 864
Thr Ile Ile Asp Ser Leu Ser Leu Ser Leu Phe Thr Asn Glu Thr Phe
275 280 285 

cta cag atg gat gag gtg ggc cac atc cca cag aca gca gcc agc cac 912
Leu Gln Met Asp Glu Val Gly His Ile Pro Gln Thr Ala Ala Ser His
290 295 300

gtg gac tcc tcc aac gct ctt ccc agg gat gag cag ccg ccc gct gac 960
Val Asp Ser Ser Asn Ala Leu Pro Arg Asp Glu Gln Pro Pro Ala Asp
305 310 315 320

atg ctt cgg cct gac ccc cgg gac acc ctc tat cga gtg cct ctg atc 1008
Met Leu Arg Pro Asp Pro Arg Asp Thr Leu Tyr Arg Val Pro Leu Ile
325 330 335

ccc aag tcg cat ctc cgc cac gtc ctg cct gac tgt ccc tac aaa ccc 1056
Pro Lys Ser His Leu Arg His Val Leu Pro Asp Cys Pro Tyr Lys Pro
340 345 350

agc tat ctg gtg gat ggg ctt cct ctg cag cgc tac cag gga ctc cgg 1104
Ser Tyr Leu Val Asp Gly Leu Pro Leu Gln Arg Tyr Gln Gly Leu Arg
355 360 365

ttt gtt cat ctg tct ttt gtt tac ccc aat gac tat acc cgc ctg agc 1152
Phe Val His Leu Ser Phe Val Tyr Pro Asn Asp Tyr Thr Arg Leu Ser
370 375 380

cac atg gag acc cac aat aaa tgt ttc tac cag gaa aac gcc tac tac 1200
His Met Glu Thr His Asn Lys Cys Phe Tyr Gln Glu Asn Ala Tyr Tyr
385 390 395 400

caa gac cgg ttc agc ttt cag gag tac atc agg att gac cag cct gag 1248
Gln Asp Arg Phe Ser Phe Gln Glu Tyr Ile Arg Ile Asp Gln Pro Glu
405 410 415

aag cag ggg ctg gag cag cca ggt ttt gag gaa aac ctt cta gaa gag 1296
Lys Gln Gly Leu Glu Gln Pro Gly Phe Glu Glu Asn Leu Leu Glu Glu
420 425 430

tcc cag tat ggg gaa gtg gca gag gag acc cct gcc tcc aac aac cag 1344
Ser Gln Tyr Gly Glu Val Ala Glu Glu Thr Pro Ala Ser Asn Asn Gln
435 440 445

aat gcc agg atg ctt gag gga aga cag aca cct gcc tcc acc ctg gag 1392
Asn Ala Arg Met Leu Glu Gly Arg Gln Thr Pro Ala Ser Thr Leu Glu
450 455 460

caa gat gcc act gac tac cgc ctc cga agc ctg cgg aaa ctc ctg gct 1440
Gln Asp Ala Thr Asp Tyr Arg Leu Arg Ser Leu Arg Lys Leu Leu Ala
465 470 475 480

cag ccc cgg gag ggc ctg ctg gcc ccc ttc tcc aag cgg aac tcc aca 1488
Gln Pro Arg Glu Gly Leu Leu Ala Pro Phe Ser Lys Arg Asn Ser Thr
485 490 495

gcg tcc ttc cca ggg agg acc agc cac att cca gtg cag cag cca gag 1536
Ala Ser Phe Pro Gly Arg Thr Ser His Ile Pro Val Gln Gln Pro Glu
500 505 510
aag agg aag caa aaa ccc agc cct gag ccc agc caa gat tca cct cat 1584
Lys Arg Lys Gln Lys Pro Ser Pro Glu Pro Ser Gln Asp Ser Pro His
515 520 525

tcc gac aag tgg cct cct ggg cac cct gtg aag aac ctg cct cag atg 1632
Ser Asp Lys Trp Pro Pro Gly His Pro Val Lys Asn Leu Pro Gln Met
530 535 540

agg ggg ccc agg ccc agg ccc gct ggt gac agc ccc agg aag act cag 1680
Arg Gly Pro Arg Pro Arg Pro Ala Gly Asp Ser Pro Arg Lys Thr Gln
545 550 555 560

tgg ctg aac cag gtg gag tcg tac atc gca gag cag aga cgg ggt gac 1728
Trp Leu Asn Gln Val Glu Ser Tyr Ile Ala Glu Gln Arg Arg Gly Asp
565 570 575

agg atg cgg cct cag gcc ccc gga agg ggc tgg cat ggg gag gag gaa 1776
Arg Met Arg Pro Gln Ala Pro Gly Arg Gly Trp His Gly Glu Glu Glu
580 585 590

gtg gtg gcg gcc gca ggc cag gaa gga caa gtg gag gga gag gaa gag 1824
Val Val Ala Ala Ala Gly Gln Glu Gly Gln Val Glu Gly Glu Glu Glu
595 600 605

ggg gaa gaa gag gag gag gaa gag gat atg agt gag gtg ttc gag tac 1872
Gly Glu Glu Glu Glu Glu Glu Glu Asp Met Ser Glu Val Phe Glu Tyr
610 615 620

gta cct gtg ttt gac ccg gta gta aac tgg gac cag acc ttc agt gcc 1920
Val Pro Val Phe Asp Pro Val Val Asn Trp Asp Gln Thr Phe Ser Ala
625 630 635 640 

cgg aat ctc gac ttc caa gcc ctg agg act gac tgg atc gat ctg agc 1968
Arg Asn Leu Asp Phe Gln Ala Leu Arg Thr Asp Trp Ile Asp Leu Ser
645 650 655

tgt aac aca tct ggc aac ctg ctg ctt cca gag cag gaa gct ctg gag 2016
Cys Asn Thr Ser Gly Asn Leu Leu Leu Pro Glu Gln Glu Ala Leu Glu
660 665 670

gtc acg cga gtc ttc ttg aag aag ctc aac cag agg agc cgg ggg agg 2064
Val Thr Arg Val Phe Leu Lys Lys Leu Asn Gln Arg Ser Arg Gly Arg
675 680 685

tac cag cta cag cgc att gtg aac gtg gaa aag cgt cag gac cag cta 2112
Tyr Gln Leu Gln Arg Ile Val Asn Val Glu Lys Arg Gln Asp Gln Leu
690 695 700

cgt ggg ggt cgc tac ctc ctg gag ctt gaa ctg ttg gaa caa ggc cag 2160
Arg Gly Gly Arg Tyr Leu Leu Glu Leu Glu Leu Leu Glu Gln Gly Gln
705 710 715 720

cgc gtg gtg cgg ctc tcg gag tat gtg tct gca cga ggc tgg cag ggc 2208
Arg Val Val Arg Leu Ser Glu Tyr Val Ser Ala Arg Gly Trp Gln Gly
725 730 735

atc gat cca gct ggt ggg gag gag gtc gag gcc cgg aac ctg caa ggc 2256
Ile Asp Pro Ala Gly Gly Glu Glu Val Glu Ala Arg Asn Leu Gln Gly
740 745 750 

ctg gtc tgg gac cca cac aac cgt agg aga cag gtc ctg aat acc cgg 2304
Leu Val Trp Asp Pro His Asn Arg Arg Arg Gln Val Leu Asn Thr Arg
755 760 765

gcc caa gag ccc aag ctg tgc tgg cct cag ggt ttc tcc tgg agt cac 2352
Ala Gln Glu Pro Lys Leu Cys Trp Pro Gln Gly Phe Ser Trp Ser His
770 775 780

cga gcc gtg gtc cac ttc gtc gtg cct gtg aag aac cag gca cgc tgg 2400
Arg Ala Val Val His Phe Val Val Pro Val Lys Asn Gln Ala Arg Trp
785 790 795 800

gta cag caa ttc atc aaa gac atg gaa aac ctg ttc cag gtc acc ggt 2448
Val Gln Gln Phe Ile Lys Asp Met Glu Asn Leu Phe Gln Val Thr Gly
805 810 815

gac cca cac ttc aac atc gtc atc act gac tat agc agt gag gac atg 2496
Asp Pro His Phe Asn Ile Val Ile Thr Asp Tyr Ser Ser Glu Asp Met
820 825 830

gat gtt gag atg gca ctg aag agg tcc aag ctg cgg agc tac cag tac 2544
Asp Val Glu Met Ala Leu Lys Arg Ser Lys Leu Arg Ser Tyr Gln Tyr
835 840 845

gtg aag cta agt gga aac ttt gaa cgc tca gct gga ctt cag gct ggc 2592
Val Lys Leu Ser Gly Asn Phe Glu Arg Ser Ala Gly Leu Gln Ala Gly
850 855 860

ata gac ctc gtg aag gac ccg cac agc atc atc ttc ctc tgt gac ctc 2640
Ile Asp Leu Val Lys Asp Pro His Ser Ile Ile Phe Leu Cys Asp Leu
865 870 875 880

cac atc cac ttc cca gct gga gtc atc gat gcc att cgg aag cac tgt 2688
His Ile His Phe Pro Ala Gly Val Ile Asp Ala Ile Arg Lys His Cys
885 890 895

gtg gag gga aag atg gcc ttt gcc ccc atg gtg atg agg ctg cat tgt 2736
Val Glu Gly Lys Met Ala Phe Ala Pro Met Val Met Arg Leu His Cys
900 905 910



ggg gcc acc ccc cag tgg cct gag ggc tac tgg gag gtg aat ggg ttc 2784
Gly Ala Thr Pro Gln Trp Pro Glu Gly Tyr Trp Glu Val Asn Gly Phe
915 920 925

ggg ctg ctt ggc atc tac aag tct gac ctg gac agg att ggg ggc atg 2832
Gly Leu Leu Gly Ile Tyr Lys Ser Asp Leu Asp Arg Ile Gly Gly Met
930 935 940

aac acc aag gag ttc cga gac cgc tgg ggc ggg gaa gac tgg gag ctg 2880
Asn Thr Lys Glu Phe Arg Asp Arg Trp Gly Gly Glu Asp Trp Glu Leu
945 950 955 960

ctg gac agg ata ctc caa gcg ggc ctg gac gtg gag cgt ctc tcc ctc 2928
Leu Asp Arg Ile Leu Gln Ala Gly Leu Asp Val Glu Arg Leu Ser Leu
965 970 975

agg aat ttc ttc cat cat ttc cat tcc aag cga ggc atg tgg agc cgt 2976
Arg Asn Phe Phe His His Phe His Ser Lys Arg Gly Met Trp Ser Arg
980 985 990

cgc cag atg aag acg ctg tag 2997
Arg Gln Met Lys Thr Leu
995 998



<210> 5 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for cloning GalNAc-T1 cDNA

<400> 5 

gctcctgcag ctccagctcc a 21 



<210> 6 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for cloning GalNAc-T1 cDNA

<400> 6

aagcgactcc ctcgcgccga gt 22 

<210> 7 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for cloning GalNAc-T1 cDNA

<400> 7 

atgccgcggc tcccggtgaa gaag 24 

<210> 8 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for cloning GalNAc-T2 cDNA 

<400> 8 

ccacagttca agctccagga ggta 24 
<210> 9

<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for cloning GalNAc-T2 cDNA 
<400> 9 

ctgacgcttt tccacgttca caat 24 



<210> 10 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for cloning GalNAc-T2 cDNA 

<400> 10 

caccccgtct ctgctctgcg at 22 



<210> 11 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> Oligonucleotide primer used in PCR for cloning GalNAc-T2 cDNA 
<400> 11 

gtcttcctgg ggctgtcacc a 21 

<210> 12 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for cloning GalNAc-T2 cDNA 

<400> 12 

cacctcatcc atctgtagga acgt 24 

<210> 13 

<211> 23 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for cloning GalNAc-T2 cDNA 

<400> 13 

ctgtcgcct gcaacttcca cgt 23 
<210> 14

<211> 23 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for cloning GalNAc-T2 cDNA 

<400> 14 

aatgtcgtgg tcctcgaggc tca 23

<210> 15 

<211> 24

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for cloning GalNAc-T2 cDNA 
<400> 15 

gatggtagaa ctggaggtgt ggat 24 

<210> 16 
<211> 31 
<212> DNA 
<213> Artificial Sequence
<220>

<223> Oligonucleotide primer used in PCR for cloning GalNAc-T1 cDNA
<400> 16 

cccaagcttc ggggggtcca cgctgcgcca t 31 

<210> 17 

<211> 30 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for cloning GalNAc-T1 cDNA

<400> 17 

gctctagact caagacgccc ccgtgcgaga 30 

<210> 18 

<211> 29

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for cloning GalNAc-T2 cDNA 
<400> 18

ggaattcgag gtacggcagc tggagagaa 29 



<210> 19 

<211> 32 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for cloning GalNAc-T2 cDNA 

<400> 19 

acgcgtcgac ctacagcgtc ttcatctggc ga 32



<210> 20 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for amplifying GalNAc-T1 cDNA
<400> 20 

ctggtggatt tcgagagcga 20 
<210> 21 
<211> 18

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for amplifying GalNAc-T1 cDNA

<400> 21 

tgccgtccag gatgttgg 18 

<210> 22 

<211> 15 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide probe used in PCR for detecting GalNAc-T1 cDNA

<400> 22 

gcggtagagg acgcc 15 

<210> 23 

<211> 26 

<212> DNA 

<213> Artificial Sequence 
<220> 
<223> Oligonucleotide primer used in PCR for amplifying GalNAc-T2 cDNA
<400> 23 

atcgtcatca ctgactatag cagtga 26 



<210> 24 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for amplifying GalNAc-T2 cDNA 
<400> 24 

gaatggcatc gatgactcca g 21 

<210> 25 

<211> 17 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide probe used in PCR for detecting GalNAc-T2 cDNA 

<400> 25 

ctcgtgaagg acccgca 17 
<210> 26

<211> 1034 

<212> PRT 

<213> Mouse 

<400> 26 

Met Pro Trp Phe Pro Val Lys Lys Val Arg Lys Gln Met Lys Leu Leu
1 5 10 15

Leu Leu Leu Leu Leu Leu Thr Cys Ala Ala Trp Leu Thr Tyr Val His
20 25 30

Arg Ser Leu Val Arg Pro Gly Arg Ala Leu Arg Gln Arg Leu Gly Tyr
35 40 45

Gly Arg Asp Gly Glu Lys Leu Thr Gly Val Thr Asp Ser Arg Gly Val
50 55 60

Arg Val Pro Ser Ser Thr Gln Arg Ser Glu Asp Ser Ser Glu Ser His
65 70 75 80

Glu Glu Glu Gln Ala Pro Glu Gly Arg Gly Pro Asn Met Leu Phe Pro
85 90 95

Gly Gly Pro Arg Lys Pro Pro Pro Leu Asn Leu Thr His Gln Thr Pro
100 105 110

Pro Trp Arg Glu Glu Phe Lys Gly Gln Val Asn Leu His Val Phe Glu
115 120 125



Asp Trp Cys Gly Gly Ala Val Gly His Leu Arg Arg Asn Leu His Phe 
130 135 140 

Pro Leu Phe Pro His Thr Arg Thr Thr Val Thr Lys Leu Ala Val Ser 
145 150 155 160 

Pro Lys Trp Lys Asn Tyr Gly Leu Arg Ile Phe Gly Phe Ile His Pro
165 170 175

Ala Arg Asp Gly Asp Ile Gln Phe Ser Val Ala Ser Asp Asp Asn Ser
180 185 190

Glu Phe Trp Leu Ser Leu Asp Glu Ser Pro Ala Ala Ala Gln Leu Val
195 200 205

Ala Phe Val Gly Lys Thr Gly Ser Glu Trp Thr Ala Pro Gly Glu Phe
210 215 220

Thr Lys Phe Ser Ser Gln Val Ser Lys Pro Arg Arg Leu Met Ala Ser
225 230 235 240

Arg Arg Tyr Tyr Phe Glu Leu Leu His Lys Gln Asp Asp Lys Gly Ser
245 250 255

Asp His Val Glu Val Gly Trp Arg Ala Phe Leu Pro Gly Leu Lys Phe
260 265 270

Glu Ile Ile Asp Ser Ala His Ile Ser Leu Tyr Thr Asp Glu Ser Ser
275 280 285

Leu Lys Met Asp His Val Ala His Val Pro Gln Ser Pro Ala Ser His
290 295 300

Ile Gly Gly Phe Pro Pro Gln Gly Glu Pro Ser Ala Asp Met Leu His
305 310 315 320

Pro Asp Pro Arg Asp Thr Phe Phe Leu Thr Pro Arg Met Glu Pro Leu
325 330 335

Ser Leu Glu Asn Val Leu Glu Pro Cys Ala Tyr Ala Pro Thr Tyr Ile
340 345 350

Leu Lys Asp Phe Pro Ile Ala Arg Tyr Gln Gly Leu Gln Phe Val Tyr
355 360 365

Leu Ser Phe Ile Tyr Pro Asn Asp His Thr Arg Leu Thr His Met Glu
370 375 380 

Thr Asp Asn Lys Cys Phe Tyr Arg Glu Ser Pro Leu Tyr Leu Glu Arg 
385 390 395 400 

Phe Gly Phe Tyr Lys Tyr Met Lys Met Asp Lys Glu Glu Gly Glu Glu 
405 410 415 



Asp Glu Glu Glu Glu Val Gln Arg Arg Ala Phe Leu Phe Leu Asn Pro
420 425 430

Asp Asp Phe Leu Asp Glu Glu Asp Glu Gln Asp Leu Leu Asp Ser Leu
435 440 445

Glu Pro Thr Asp Ala Ser Val Gln Gln Ser His Arg Thr Pro Thr Pro
450 455 460

Ala Ala Ser Thr Gly Thr Thr Ala Ser Pro Thr Pro Pro Thr Thr Ser
465 470 475 480

Pro Leu Asp Glu Gln Thr Leu Arg His Ser Arg Ala Leu Asn Trp Ala
485 490 495

Pro Arg Pro Leu Pro Leu Phe Leu Gly Arg Ala Pro Pro Pro Arg Thr
500 505 510

Val Glu Lys Ser Pro Ser Lys Val Tyr Val Thr Arg Val Arg Pro Gly
515 520 525

Gln Arg Ala Ser Pro Arg Ala Leu Arg Asp Ser Pro Trp Pro Pro Phe
530 535 540

Pro Gly Val Phe Leu Arg Pro Lys Pro Leu Pro Arg Val Gln Leu Arg
545 550 555 560

Val Pro Pro His Pro Pro Arg Thr Gln Gly Tyr Arg Thr Ser Gly Pro
565 570 575

Lys Val Thr Glu Leu Lys Pro Pro Val Arg Ala Gln Thr Ser Gln Gly



580 585 590 

Gly Arg Glu Gly Gln Leu His Gly Gln Gly Leu Met Val Pro Thr Val
595 600 605

Asp Leu Asn Ser Ser Val Glu Thr Gln Pro Val Thr Ser Phe Leu Ser
610 615 620

Leu Ser Gln Val Ser Arg Pro Gln Leu Pro Gly Glu Gly Glu Glu Gly
625 630 635 640 

Glu Glu Asp Gly Ala Pro Gly Asp Glu Ala Thr Ser Glu Asp Ser Glu 
645 650 655 

Glu Glu Glu Glu Pro Ala Ala Gly Arg Pro Leu Gly Arg Trp Arg Glu 
660 665 670 

Asp Ala Ile Asn Trp Gln Arg Thr Phe Ser Val Gly Ala Met Asp Phe
675 680 685

Glu Leu Leu Arg Ser Asp Trp Asn Asp Leu Arg Cys Asn Val Ser Gly
690 695 700

Asn Leu Gln Leu Pro Glu Ala Glu Ala Val Asp Val Val Ala Gln Tyr
705 710 715 720

Met Glu Arg Leu Asn Ala Lys His Gly Gly Arg Phe Ser Leu Leu Arg
725 730 735

Ile Val Asn Val Glu Lys Arg Arg Asp Ser Ala Arg Gly Ser Arg Phe
740 745 750

Leu Leu Glu Leu Glu Leu Gln Glu Arg Gly Gly Ser Arg Gln Arg Leu
755 760 765

Ser Glu Tyr Val Phe Leu Arg Leu Pro Gly Ala Arg Val Gly Asp Glu
770 775 780

Asp Gly Glu Ser Pro Glu Pro Pro Pro Ala Ala Ser Ile His Pro Asp
785 790 795 800

Ser Arg Pro Glu Leu Cys Arg Pro Leu His Leu Ala Trp Arg Gln Asp
805 810 815

Val Met Val His Phe Ile Val Pro Val Lys Asn Gln Ala Arg Trp Val
820 825 830

Val Gln Phe Leu Ala Asp Met Thr Ala Leu His Val His Thr Gly Asp
835 840 845

Ser Tyr Phe Asn Ile Ile Leu Val Asp Phe Glu Ser Glu Asp Met Asp
850 855 860

Val Glu Arg Ala Leu Arg Ala Ala Gln Leu Pro Arg Tyr Gln Tyr Leu
865 870 875 880

Lys Arg Thr Gly Asn Phe Glu Arg Ser Ala Gly Leu Gln Thr Gly Val
885 890 895

Asp Ala Val Glu Asp Pro Ser Ser Ile Val Phe Leu Cys Asp Leu His
900 905 910

Ile His Phe Pro Pro Asn Ile Leu Asp Ser Ile Arg Lys His Cys Val
915 920 925 

Glu Gly Lys Leu Ala Phe Ala Pro Val Val Met Arg Leu Gly Cys Gly 
930 935 940 

Ser Ser Pro Trp Asp Pro His Gly Tyr Trp Glu Val Asn Gly Phe Gly 
945 950 955 960 

Leu Phe Gly Ile Tyr Lys Ser Asp Phe Asp Arg Val Gly Gly Met Asn
965 970 975

Thr Glu Glu Phe Arg Asp Gln Trp Gly Gly Glu Asp Trp Glu Leu Leu
980 985 990

Asp Arg Val Leu Gln Ala Gly Leu Glu Val Glu Arg Leu Arg Leu Arg
995 1000 1005

His Phe Tyr His His Tyr His Ser Lys Arg Gly Met Trp Ala Thr Arg
1010 1015 1020

Ser Arg Lys Gly Ala Arg Ala Gln Arg Ser
1025 1030 
<210> 27

<211> 3105 

<212> DNA 

<213> Mouse 



<400> 27 

atg ccg tgg ttc ccg gtg aag aag gtc cgc aag cag atg aag ctg ctg 48
Met Pro Trp Phe Pro Val Lys Lys Val Arg Lys Gln Met Lys Leu Leu
1 5 10 15

ctg ctg ttg ctg ctg ctc acc tgc gcc gcg tgg ctc acg tat gtg cac 96
Leu Leu Leu Leu Leu Leu Thr Cys Ala Ala Trp Leu Thr Tyr Val His
20 25 30

cgg agc ctg gtg cgc ccg ggc cgc gcg cta cgc cag cgg ctg ggc tac 144
Arg Ser Leu Val Arg Pro Gly Arg Ala Leu Arg Gln Arg Leu Gly Tyr
35 40 45

ggg cga gat ggg gag aag ctg acc ggt gtg acc gat agc cgc gga gtc 192
Gly Arg Asp Gly Glu Lys Leu Thr Gly Val Thr Asp Ser Arg Gly Val
50 55 60

cga gtg cca tcg tcc aca cag agg tcg gag gac tcg agt gaa agt cat 240
Arg Val Pro Ser Ser Thr Gln Arg Ser Glu Asp Ser Ser Glu Ser His
65 70 75 80

gaa gag gag cag gcg ccc gag ggg cgg ggc cca aac atg ctg ttt cct 288
Glu Glu Glu Gln Ala Pro Glu Gly Arg Gly Pro Asn Met Leu Phe Pro
85 90 95

gga gga cct agg aag cca ccc cca ctg aac ctc acc cac cag aca ccc 336
Gly Gly Pro Arg Lys Pro Pro Pro Leu Asn Leu Thr His Gln Thr Pro
100 105 110

cca tgg cgg gaa gag ttc aaa gga cag gtg aac ctg cac gtg ttt gag 384
Pro Trp Arg Glu Glu Phe Lys Gly Gln Val Asn Leu His Val Phe Glu
115 120 125

gac tgg tgt gga ggt gct gtg ggc cac ctg aga cgg aat ctg cac ttc 432
Asp Trp Cys Gly Gly Ala Val Gly His Leu Arg Arg Asn Leu His Phe
130 135 140

cca ctc ttt cct cac act cgt act acg gtg aca aag tta gct gtg tcc 480
Pro Leu Phe Pro His Thr Arg Thr Thr Val Thr Lys Leu Ala Val Ser
145 150 155 160

cct aag tgg aag aac tat gga ctc cgg att ttt ggc ttc atc cac cca 528
Pro Lys Trp Lys Asn Tyr Gly Leu Arg Ile Phe Gly Phe Ile His Pro
165 170 175

gcc aga gat gga gac atc cag ttc tct gtg gct tcg gat gac aac tct 576
Ala Arg Asp Gly Asp Ile Gln Phe Ser Val Ala Ser Asp Asp Asn Ser
180 185 190

gag ttc tgg ctg agt ttg gat gag agc cca gca gcc gcc cag ctt gta 624
Glu Phe Trp Leu Ser Leu Asp Glu Ser Pro Ala Ala Ala Gln Leu Val
195 200 205

gcc ttt gtg ggc aag act ggc tcc gag tgg acc gca cct gga gaa ttc 672
Ala Phe Val Gly Lys Thr Gly Ser Glu Trp Thr Ala Pro Gly Glu Phe
210 215 220

acc aag ttc agc tcc cag gtg tct aag cca cgt cgg ctc atg gcc tcc 720
Thr Lys Phe Ser Ser Gln Val Ser Lys Pro Arg Arg Leu Met Ala Ser
225 230 235 240

cgg aga tac tac ttt gaa ctg ctc cac aag caa gat gac aag ggt tca 768
Arg Arg Tyr Tyr Phe Glu Leu Leu His Lys Gln Asp Asp Lys Gly Ser
245 250 255

gac cat gtg gaa gtg ggt tgg cga gct ttc ctg cct ggt ctg aag ttc 816
Asp His Val Glu Val Gly Trp Arg Ala Phe Leu Pro Gly Leu Lys Phe
260 265 270

gag atc att gat tct gct cac att tcc ctg tac aca gat gag tca tct 864
Glu Ile Ile Asp Ser Ala His Ile Ser Leu Tyr Thr Asp Glu Ser Ser
275 280 285

ctg aag atg gac cat gtg gcc cat gtg cct cag tct cca gcc agc cac 912
Leu Lys Met Asp His Val Ala His Val Pro Gln Ser Pro Ala Ser His
290 295 300

ata gga gga ttc ccg ccg cag ggg gaa ccc agc gcc gac atg ctg cac 960
Ile Gly Gly Phe Pro Pro Gln Gly Glu Pro Ser Ala Asp Met Leu His
305 310 315 320

cca gac ccc agg gat acc ttc ttc ctc act cct cgg atg gaa cct ttg 1008
Pro Asp Pro Arg Asp Thr Phe Phe Leu Thr Pro Arg Met Glu Pro Leu
325 330 335

agc ctg gag aat gtt ctg gag ccc tgt gcc tat gcc ccc acc tat atc 1056
Ser Leu Glu Asn Val Leu Glu Pro Cys Ala Tyr Ala Pro Thr Tyr Ile
340 345 350

ctc aag gat ttc ccc ata gcc aga tac caa gga cta cag ttt gtg tac 1104
Leu Lys Asp Phe Pro Ile Ala Arg Tyr Gln Gly Leu Gln Phe Val Tyr
355 360 365

ctg tcc ttc atc tac ccc aat gac cat acc cgt ctc act cac atg gag 1152
Leu Ser Phe Ile Tyr Pro Asn Asp His Thr Arg Leu Thr His Met Glu
370 375 380

aca gac aac aag tgc ttc tac cgt gag tcc cca cta tac ctg gaa agg 1200
Thr Asp Asn Lys Cys Phe Tyr Arg Glu Ser Pro Leu Tyr Leu Glu Arg
385 390 395 400

ttt ggg ttc tat aaa tac atg aaa atg gac aag gag gag gga gag gaa 1248
Phe Gly Phe Tyr Lys Tyr Met Lys Met Asp Lys Glu Glu Gly Glu Glu
405 410 415

gat gag gag gaa gaa gtt cag cgt aga gcc ttc ctc ttc ctc aac cca 1296
Asp Glu Glu Glu Glu Val Gln Arg Arg Ala Phe Leu Phe Leu Asn Pro
420 425 430

gat gac ttc ctg gat gag gag gat gag cag gat ctg tta gac agc ctg 1344
Asp Asp Phe Leu Asp Glu Glu Asp Glu Gln Asp Leu Leu Asp Ser Leu
435 440 445

gag ccc acc gat gca tct gta cag cag agc cac agg acc ccc acc cca 1392
Glu Pro Thr Asp Ala Ser Val Gln Gln Ser His Arg Thr Pro Thr Pro
450 455 460

gca gcc tcc act gga acg aca gcc agc ccg acc cca cct aca act agt 1440
Ala Ala Ser Thr Gly Thr Thr Ala Ser Pro Thr Pro Pro Thr Thr Ser
465 470 475 480

cct ctg gac gag cag acc ctc aga cac tcc cgg gca ctg aat tgg gcc 1488
Pro Leu Asp Glu Gln Thr Leu Arg His Ser Arg Ala Leu Asn Trp Ala
485 490 495



cca cgc ccc ctg ccc ctc ttc ttg ggg cga gct cca cct ccc cga act 1536
Pro Arg Pro Leu Pro Leu Phe Leu Gly Arg Ala Pro Pro Pro Arg Thr
500 505 510

gtg gag aag tcg cct tca aag gtg tac gtg acc agg gtc cga cct gga 1584
Val Glu Lys Ser Pro Ser Lys Val Tyr Val Thr Arg Val Arg Pro Gly
515 520 525

cag cgg gct tcc ccg agg gca ttg cga gac tca ccc tgg cca ccc ttc 1632
Gln Arg Ala Ser Pro Arg Ala Leu Arg Asp Ser Pro Trp Pro Pro Phe
530 535 540

cct ggc gtc ttc ctg cgc ccc aag cct ctg ccc aga gta cag ctg cgg 1680
Pro Gly Val Phe Leu Arg Pro Lys Pro Leu Pro Arg Val Gln Leu Arg
545 550 555 560

gta ccc cca cat cca cct cgg acc cag ggc tat agg acc agt ggc ccc 1728
Val Pro Pro His Pro Pro Arg Thr Gln Gly Tyr Arg Thr Ser Gly Pro
565 570 575

aag gtc aca gaa cta aag ccc cca gtc agg gcc cag acc agc cag gga 1776
Lys Val Thr Glu Leu Lys Pro Pro Val Arg Ala Gln Thr Ser Gln Gly
580 585 590

ggc cgg gag ggc cag tta cat gga cag gga ctc atg gtg ccc aca gtg 1824
Gly Arg Glu Gly Gln Leu His Gly Gln Gly Leu Met Val Pro Thr Val
595 600 605

gac ttg aac tcc tca gtg gaa aca cag cct gtg act tcc ttc ctg agc 1872
Asp Leu Asn Ser Ser Val Glu Thr Gln Pro Val Thr Ser Phe Leu Ser
610 615 620

ttg tct cag gta tcc agg cca cag ctg cca gga gag ggt gaa gaa ggg 1920
Leu Ser Gln Val Ser Arg Pro Gln Leu Pro Gly Glu Gly Glu Glu Gly
625 630 635 640

gag gag gat ggg gcc cca ggt gat gag gcc aca tca gaa gac agt gag 1968
Glu Glu Asp Gly Ala Pro Gly Asp Glu Ala Thr Ser Glu Asp Ser Glu
645 650 655

gaa gag gag gag ccg gcc gct ggg cgg ccc ctg ggt cgc tgg cgg gag 2016
Glu Glu Glu Glu Pro Ala Ala Gly Arg Pro Leu Gly Arg Trp Arg Glu
660 665 670

gat gcc atc aac tgg cag cgc acg ttc agc gtg ggc gcc atg gac ttc 2064
Asp Ala Ile Asn Trp Gln Arg Thr Phe Ser Val Gly Ala Met Asp Phe
675 680 685



gag ctc ctg cgc tct gac tgg aac gac ctg cgc tgt aac gta tcc ggg 2112
Glu Leu Leu Arg Ser Asp Trp Asn Asp Leu Arg Cys Asn Val Ser Gly
690 695 700

aac ctg caa ctt cct gag gcc gaa gcg gtg gat gta gtg gct cag tac 2160
Asn Leu Gln Leu Pro Glu Ala Glu Ala Val Asp Val Val Ala Gln Tyr
705 710 715 720

atg gag cgg cta aat gca aag cat ggc ggg cgc ttc tcg ctt cta cgc 2208
Met Glu Arg Leu Asn Ala Lys His Gly Gly Arg Phe Ser Leu Leu Arg
725 730 735

atc gtg aac gtg gag aag cgc cgc gac tct gca cgc ggg agc cgc ttc 2256
Ile Val Asn Val Glu Lys Arg Arg Asp Ser Ala Arg Gly Ser Arg Phe
740 745 750

ctc ctg gaa ctg gaa ttg caa gag cgc gga ggg agc cgc cag cgc cta 2304
Leu Leu Glu Leu Glu Leu Gln Glu Arg Gly Gly Ser Arg Gln Arg Leu
755 760 765

tcc gaa tac gtc ttc ctg cgg ttg ccc gga gcc cgc gtt ggg gac gaa 2352
Ser Glu Tyr Val Phe Leu Arg Leu Pro Gly Ala Arg Val Gly Asp Glu
770 775 780

gat gga gaa agt ccc gag ccg cct cca gcc gcc tcg atc cac cca gac 2400
Asp Gly Glu Ser Pro Glu Pro Pro Pro Ala Ala Ser Ile His Pro Asp
785 790 795 800

agt cgc cca gag ctc tgc cgg cct ttg cat ctg gcc tgg cgt cag gat 2448
Ser Arg Pro Glu Leu Cys Arg Pro Leu His Leu Ala Trp Arg Gln Asp
805 810 815

gtc atg gtt cat ttc att gta cca gtg aag aat cag gcg cgc tgg gta 2496
Val Met Val His Phe Ile Val Pro Val Lys Asn Gln Ala Arg Trp Val
820 825 830

gtg cag ttc ctg gca gat atg acc gcg ctg cat gtg cat acg ggg gac 2544
Val Gln Phe Leu Ala Asp Met Thr Ala Leu His Val His Thr Gly Asp
835 840 845

tcg tac ttc aac atc atc ttg gtg gac ttt gag agc gag gac atg gat 2592
Ser Tyr Phe Asn Ile Ile Leu Val Asp Phe Glu Ser Glu Asp Met Asp
850 855 860

gtg gag cgg gcc ctg cgt gcg gct cag cta cct cgg tac cag tac ttg 2640
Val Glu Arg Ala Leu Arg Ala Ala Gln Leu Pro Arg Tyr Gln Tyr Leu
865 870 875 880

aaa cga act gga aac ttc gag cgc tct gca ggc ctg caa act gga gtg 2688
Lys Arg Thr Gly Asn Phe Glu Arg Ser Ala Gly Leu Gln Thr Gly Val
885 890 895

gat gcc gtg gag gac ccc agc agc atc gtt ttc ctc tgt gac ctg cac 2736
Asp Ala Val Glu Asp Pro Ser Ser Ile Val Phe Leu Cys Asp Leu His
900 905 910

atc cac ttc cca cct aat atc ctg gac agc atc cgc aag cat tgc gtg 2784
Ile His Phe Pro Pro Asn Ile Leu Asp Ser Ile Arg Lys His Cys Val
915 920 925

gag ggc aag ctg gcc ttc gcc cct gtg gtc atg cgt ctg ggc tgt gga 2832
Glu Gly Lys Leu Ala Phe Ala Pro Val Val Met Arg Leu Gly Cys Gly
930 935 940

agc tca ccg tgg gac cca cat ggt tac tgg gaa gtg aat gga ttt ggc 2880
Ser Ser Pro Trp Asp Pro His Gly Tyr Trp Glu Val Asn Gly Phe Gly
945 950 955 960

ctc ttt ggg atc tac aaa tca gac ttt gac aga gta gga ggc atg aac 2928
Leu Phe Gly Ile Tyr Lys Ser Asp Phe Asp Arg Val Gly Gly Met Asn
965 970 975

act gag gag ttc cgt gac cag tgg gga ggc gag gac tgg gaa ctt ctt 2976
Thr Glu Glu Phe Arg Asp Gln Trp Gly Gly Glu Asp Trp Glu Leu Leu
980 985 990

gac agg gtc ctg cag gca ggg ctg gag gtg gag agg ctt cga ctg cga 3024
Asp Arg Val Leu Gln Ala Gly Leu Glu Val Glu Arg Leu Arg Leu Arg
995 1000 1005

cac ttc tac cac cac tat cac tcg aag cga ggc atg tgg gcc aca cgc 3072
His Phe Tyr His His Tyr His Ser Lys Arg Gly Met Trp Ala Thr Arg
1010 1015 1020

agc cgc aaa ggt gcc cgc gca cag cga tcc tga 3105
Ser Arg Lys Gly Ala Arg Ala Gln Arg Ser
1025 1030



<210> 28 

<211> 986 

<212> PRT 

<213> Mouse 

<400> 28 

Met Gly Ser Pro Arg Ala Ala Leu Leu Met Leu Leu Leu Arg Pro Ile
1 5 10 15

Lys Leu Leu Arg Arg Arg Phe Arg Leu Leu Leu Leu Leu Ala Val Val
20 25 30

Ser Val Gly Leu Trp Thr Leu Tyr Leu Glu Leu Val Ala Ser Ala Gln
35 40 45

Ala Gly Gly Asn Pro Leu Asn His Arg Tyr Gly Ser Trp Arg Glu Leu
50 55 60

Ala Lys Ala Leu Ala Ser Arg Asn Ile Pro Ala Val Asp Pro Asn Leu
65 70 75 80

Gln Phe Tyr Arg Pro Gln Arg Leu Ser Leu Lys Asp Gln Glu Ile Ala
85 90 95

Arg Ser Arg Ser Arg Asn Ser Ser Tyr Leu Lys Trp Asn Lys Pro Val
100 105 110 

Pro Trp Leu Ser Glu Phe Arg Gly His Ala Asn Leu His Val Phe Glu 
115 120 125 

Asp Trp Cys Gly Ser Ser Ile Gln Gln Leu Arg Asn Asn Leu His Phe
130 135 140

Pro Leu Tyr Pro His Ile Arg Thr Thr Leu Arg Lys Leu Ala Val Ser
145 150 155 160

Pro Lys Trp Thr Asn Tyr Gly Leu Arg Ile Phe Gly Tyr Leu His Pro
165 170 175

Phe Thr Asp Gly Lys Ile Gln Phe Ala Ile Ala Ala Asp Asp Asn Ala
180 185 190

Glu Phe Trp Leu Ser Arg Asp Asp Gln Val Ser Gly Leu Gln Leu Leu
195 200 205

Ala Ser Val Gly Lys Thr Gly Lys Glu Trp Thr Ala Pro Gly Glu Phe
210 215 220

Gly Lys Phe Gln Ser Gln Ile Ser Lys Pro Val Ser Leu Ser Ala Ser
225 230 235 240

Leu Arg Tyr Tyr Phe Glu Val Leu His Lys Gln Asn Asp Glu Gly Thr
245 250 255

Asp His Val Glu Val Ala Trp Arg Arg Asn Asp Pro Gly Ala Lys Phe
260 265 270

Thr Ile Ile Asp Ser Pro Phe Leu Ser Leu Phe Thr Asn Glu Thr Ile
275 280 285

Leu Arg Met Asp Glu Val Gly His Ile Pro Gln Thr Ala Ala Ser His
290 295 300

Val Gly Ser Ser Asn Thr Pro Pro Arg Asp Glu Gln Pro Pro Ala Asp
305 310 315 320

Met Leu Arg Pro Asp Pro Arg Asp Thr Leu Phe Arg Val Pro Leu Ile
325 330 335 

Ala Lys Ser His Leu Arg His Val Leu Pro Asp Cys Pro Tyr Lys Pro 
340 345 350 

Ser Tyr Leu Val Asp Gly Leu Pro Leu Gln Arg Tyr Gln Gly Leu Arg
355 360 365

Phe Val His Leu Ser Phe Val Tyr Pro Asn Asp Tyr Thr Arg Leu Ser
370 375 380

His Met Glu Thr His Asn Lys Cys Phe Tyr Gln Glu Ser Ala Tyr Asp
385 390 395 400

Gln Asp Arg Ser Ser Phe Gln Glu Tyr Ile Lys Met Asp Lys Pro Glu
405 410 415

Lys His Gly Pro Glu Gln Pro Ala Gly Leu Glu Asp Gly Leu Leu Glu
420 425 430

Glu Ser Gln Tyr Glu Asp Val Pro Glu Glu Ile Pro Thr Ser Gln Asp
435 440 445

Gln Asn Thr Gly Ile Gln Gly Arg Lys Gln Lys Thr Ile Ser Thr Pro
450 455 460

Gly Leu Gly Val Thr Asp Tyr His Leu Arg Lys Leu Leu Ala Arg Ser
465 470 475 480

Gln Ser Gly Pro Val Ala Pro Leu Ser Lys Gln Asn Ser Thr Thr Ala
485 490 495

Phe Pro Thr Arg Thr Ser Asn Ile Pro Val Gln Arg Pro Glu Lys Ser
500 505 510

Pro Val Pro Ser Arg Asp Leu Ser His Ser Asp Gln Gly Ala Arg Arg
515 520 525

Asn Leu Pro Leu Ile Gln Arg Ala Arg Pro Thr Gly Asp Arg Pro Gly
530 535 540

Lys Thr Leu Glu Gln Ser Gln Trp Leu Asn Gln Val Glu Ser Phe Ile
545 550 555 560

Ala Glu Gln Arg Arg Gly Asp Arg Ile Glu Pro Pro Thr Pro Ser Arg
565 570 575

Gly Trp Arg Pro Glu Glu Asp Val Val Ile Ala Ala Asp Gln Glu Gly
580 585 590 

Glu Val Glu Glu Glu Glu Glu Gly Glu Asp Glu Glu Glu Asp Met Ser 
595 600 605 

Glu Val Phe Glu Tyr Val Pro Met Phe Asp Pro Val Val Asn Trp Gly 
610 615 620 

Gln Thr Phe Ser Ala Gln Asn Leu Asp Phe Gln Ala Leu Arg Thr Asp
625 630 635 640

Trp Ile Asp Leu Asn Cys Asn Thr Ser Gly Asn Leu Leu Leu Pro Glu
645 650 655

Gln Glu Ala Leu Glu Val Thr Arg Val Phe Leu Arg Lys Leu Ser Gln
660 665 670

Arg Thr Arg Gly Arg Tyr Gln Leu Gln Arg Ile Val Asn Val Glu Lys
675 680 685

Arg Gln Asp Arg Leu Arg Gly Gly Arg Tyr Phe Leu Glu Leu Glu Leu
690 695 700

Leu Asp Gly Gln Arg Leu Val Arg Leu Ser Glu Tyr Val Ser Thr Arg
705 710 715 720

Gly Trp Arg Gly Gly Asp His Pro Gly Arg Glu Asp Thr Glu Ala Arg
725 730 735

Asn Leu Gln Gly Leu Val Trp Ser Pro Arg Asn Arg His Arg His Val
740 745 750

Leu Asn Ala Gln Asp Pro Glu Pro Lys Leu Cys Trp Pro Gln Gly Phe
755 760 765

Ser Trp Asn His Arg Ala Val Val His Phe Ile Val Pro Val Lys Asn
770 775 780

Gln Ala Arg Trp Val Gln Gln Phe Ile Arg Asp Met Glu Ser Leu Ser
785 790 795 800

Gln Val Thr Gly Asp Ala His Phe Ser Ile Ile Ile Thr Asp Tyr Ser
805 810 815 

Ser Glu Asp Met Asp Val Glu Met Ala Leu Lys Arg Ser Arg Leu Arg 
820 825 830 

Ser Tyr Gln Tyr Leu Lys Leu Ser Gly Asn Phe Glu Arg Ser Ala Gly
835 840 845

Leu Gln Ala Gly Ile Asp Leu Val Lys Asp Pro His Ser Ile Ile Phe
850 855 860

Leu Cys Asp Leu His Ile His Phe Pro Ala Gly Ile Ile Asp Thr Ile
865 870 875 880

Arg Lys His Cys Val Glu Gly Lys Met Ala Phe Ala Pro Met Val Met
885 890 895

Arg Leu His Cys Gly Ala Thr Pro Gln Trp Pro Glu Gly Tyr Trp Glu
900 905 910

Val Asn Gly Phe Gly Leu Leu Gly Ile Tyr Lys Ser Asp Leu Asp Lys
915 920 925

Ile Gly Gly Met Asn Thr Lys Glu Phe Arg Asp Arg Trp Gly Gly Glu
930 935 940

Asp Trp Glu Leu Leu Asp Arg Ile Leu Gln Ala Gly Leu Glu Val Glu
945 950 955 960

Arg Leu Ser Leu Arg Asn Phe Phe His His Phe His Ser Lys Arg Gly
965 970 975

Met Trp Asn Arg Arg Gln Met Lys Met Pro
980 985 



<210> 29 

<211> 2961 

<212> DNA 

<213> Mouse 
<400> 29

atg ggg agc ccc cgc gcc gcg ttg ctg atg ctg ctc ctg cgc ccg atc 48
Met Gly Ser Pro Arg Ala Ala Leu Leu Met Leu Leu Leu Arg Pro Ile
1 5 10 15

aag ctg ctg agg agg cgc ttc cgg ctg ctg ctg ctg ctc gcc gta gta 96
Lys Leu Leu Arg Arg Arg Phe Arg Leu Leu Leu Leu Leu Ala Val Val
20 25 30

tcg gtg gga ctc tgg act ctg tat ctg gag ctg gtg gcg tcg gcc cag 144
Ser Val Gly Leu Trp Thr Leu Tyr Leu Glu Leu Val Ala Ser Ala Gln
35 40 45

gcc ggc ggg aac ccc ctg aac cac agg tat ggc agc tgg cga gaa ctg 192
Ala Gly Gly Asn Pro Leu Asn His Arg Tyr Gly Ser Trp Arg Glu Leu
50 55 60

gcc aag gcc cta gcc agc agg aac atc cca gcc gtt gat ccg aat ctc 240
Ala Lys Ala Leu Ala Ser Arg Asn Ile Pro Ala Val Asp Pro Asn Leu
65 70 75 80

caa ttc tac cgt ccc cag cgg ctg agc ctc aag gac caa gaa att gcc 288
Gln Phe Tyr Arg Pro Gln Arg Leu Ser Leu Lys Asp Gln Glu Ile Ala
85 90 95

cga agt agg agt agg aac agt agc tac ctg aag tgg aac aag cct gtc 336
Arg Ser Arg Ser Arg Asn Ser Ser Tyr Leu Lys Trp Asn Lys Pro Val
100 105 110

ccc tgg ctc tca gag ttc cgg ggc cac gcc aac cta cat gtg ttt gaa 384
Pro Trp Leu Ser Glu Phe Arg Gly His Ala Asn Leu His Val Phe Glu
115 120 125

gac tgg tgt ggc agc tcc atc caa cag ctg agg aac aac ctg cac ttc 432
Asp Trp Cys Gly Ser Ser Ile Gln Gln Leu Arg Asn Asn Leu His Phe
130 135 140

cca ctc tac ccc cac atc cgc aca act ctg agg aag ctg gct gtg tcc 480
Pro Leu Tyr Pro His Ile Arg Thr Thr Leu Arg Lys Leu Ala Val Ser
145 150 155 160

ccc aag tgg acc aac tat ggc ctc cgc ata ttt ggc tat ctg cac cct 528
Pro Lys Trp Thr Asn Tyr Gly Leu Arg Ile Phe Gly Tyr Leu His Pro
165 170 175



ttc acc gat ggg aaa atc cag ttt gcc atc gct gct gat gac aat gct 576
Phe Thr Asp Gly Lys Ile Gln Phe Ala Ile Ala Ala Asp Asp Asn Ala
180 185 190

gag ttc tgg ctg agt cgt gat gac cag gtc tca ggc ctt cag ctg ctg 624
Glu Phe Trp Leu Ser Arg Asp Asp Gln Val Ser Gly Leu Gln Leu Leu
195 200 205

gcc agc gtg ggc aag aca gga aag gaa tgg aca gcc cct gga gag ttt 672
Ala Ser Val Gly Lys Thr Gly Lys Glu Trp Thr Ala Pro Gly Glu Phe
210 215 220

ggg aaa ttt cag agt caa att tcc aag cca gtg agt tta tca gcc tcc 720
Gly Lys Phe Gln Ser Gln Ile Ser Lys Pro Val Ser Leu Ser Ala Ser
225 230 235 240

ctc agg tac tac ttt gag gtc ctg cac aag caa aat gat gaa ggc act 768
Leu Arg Tyr Tyr Phe Glu Val Leu His Lys Gln Asn Asp Glu Gly Thr
245 250 255

gac cac gtg gag gtc gcg tgg aga cgg aat gac cct gga gcc aag ttc 816
Asp His Val Glu Val Ala Trp Arg Arg Asn Asp Pro Gly Ala Lys Phe
260 265 270

acc atc att gac tcc ccc ttc tta tct ctc ttt aca aat gag acc atc 864
Thr Ile Ile Asp Ser Pro Phe Leu Ser Leu Phe Thr Asn Glu Thr Ile
275 280 285

cta agg atg gat gag gtg ggc cat atc cca cag aca gca gcc agc cat 912
Leu Arg Met Asp Glu Val Gly His Ile Pro Gln Thr Ala Ala Ser His
290 295 300

gta ggc tcc tcc aac act cct ccc cgg gat gag cag ccc cca gct gac 960
Val Gly Ser Ser Asn Thr Pro Pro Arg Asp Glu Gln Pro Pro Ala Asp
305 310 315 320

atg ctg cgg cct gac cct cgg gac acc ctc ttt cga gtg cct ctg atc 1008
Met Leu Arg Pro Asp Pro Arg Asp Thr Leu Phe Arg Val Pro Leu Ile
325 330 335

gcc aag tcc cat ctg cgc cac gtc ctg ccc gat tgt ccc tac aaa ccc 1056
Ala Lys Ser His Leu Arg His Val Leu Pro Asp Cys Pro Tyr Lys Pro
340 345 350

agc tac ctg gtg gat gga ctc ccg cta cag cgc tac cag ggc ctc cgt 1104
Ser Tyr Leu Val Asp Gly Leu Pro Leu Gln Arg Tyr Gln Gly Leu Arg
355 360 365

ttt gtt cac ctg tcc ttt gtt tat ccc aat gac tat acc cgt ctg agc 1152
Phe Val His Leu Ser Phe Val Tyr Pro Asn Asp Tyr Thr Arg Leu Ser
370 375 380

cac atg gag acc cat aat aaa tgt ttc tac caa gaa agt gcc tat gac 1200
His Met Glu Thr His Asn Lys Cys Phe Tyr Gln Glu Ser Ala Tyr Asp
385 390 395 400

cag gac agg tcc agc ttc cag gaa tat atc aag atg gac aag cca gag 1248
Gln Asp Arg Ser Ser Phe Gln Glu Tyr Ile Lys Met Asp Lys Pro Glu
405 410 415

aag cat ggc ccg gag cag cca gca ggt ttg gag gat ggc ctt cta gaa 1296
Lys His Gly Pro Glu Gln Pro Ala Gly Leu Glu Asp Gly Leu Leu Glu
420 425 430

gaa tcc cag tat gaa gac gta cca gag gaa atc ccc acc tct caa gac 1344
Glu Ser Gln Tyr Glu Asp Val Pro Glu Glu Ile Pro Thr Ser Gln Asp
435 440 445

cag aat act ggg ata caa ggg aga aaa cag aag act att tcc acc ccg 1392
Gln Asn Thr Gly Ile Gln Gly Arg Lys Gln Lys Thr Ile Ser Thr Pro
450 455 460

ggg ctg ggt gtc act gac tac cac ctg cgg aag ctc ttg gct cgc tca 1440
Gly Leu Gly Val Thr Asp Tyr His Leu Arg Lys Leu Leu Ala Arg Ser
465 470 475 480

cag agt ggc cct gta gcg cct ctt tcc aaa cag aac tct aca act gcc 1488
Gln Ser Gly Pro Val Ala Pro Leu Ser Lys Gln Asn Ser Thr Thr Ala
485 490 495

ttt cca acc agg aca agc aac atc cca gtc cag cgg cca gag aaa agc 1536
Phe Pro Thr Arg Thr Ser Asn Ile Pro Val Gln Arg Pro Glu Lys Ser
500 505 510

cct gtg ccc agc cga gat ttg tct cat tct gac cag ggg gcc cgg agg 1584
Pro Val Pro Ser Arg Asp Leu Ser His Ser Asp Gln Gly Ala Arg Arg
515 520 525

aac ctg cct ctc atc cag aga gcc agg ccc act ggt gac aga cct ggg 1632
Asn Leu Pro Leu Ile Gln Arg Ala Arg Pro Thr Gly Asp Arg Pro Gly
530 535 540

aag act ctt gag cag tcc cag tgg ctg aat caa gtg gaa tcc ttc att 1680
Lys Thr Leu Glu Gln Ser Gln Trp Leu Asn Gln Val Glu Ser Phe Ile
545 550 555 560

gct gag cag aga agg gga gac agg ata gag cct cca acc ccc agc agg 1728
Ala Glu Gln Arg Arg Gly Asp Arg Ile Glu Pro Pro Thr Pro Ser Arg
565 570 575

ggc tgg cgt cct gag gag gac gtg gtg ata gcg gcg gac cag gaa gga 1776
Gly Trp Arg Pro Glu Glu Asp Val Val Ile Ala Ala Asp Gln Glu Gly
580 585 590

gaa gtg gag gag gag gaa gag ggg gaa gat gag gaa gaa gat atg agt 1824
Glu Val Glu Glu Glu Glu Glu Gly Glu Asp Glu Glu Glu Asp Met Ser
595 600 605

gag gtg ttc gaa tat gtg cct atg ttt gac cca gtg gtg aac tgg ggc 1872
Glu Val Phe Glu Tyr Val Pro Met Phe Asp Pro Val Val Asn Trp Gly
610 615 620

cag acc ttc agc gct cag aac ctc gac ttc caa gcc ctg aga acc gac 1920
Gln Thr Phe Ser Ala Gln Asn Leu Asp Phe Gln Ala Leu Arg Thr Asp
625 630 635 640

tgg atc gac ctg aac tgt aac aca tcg ggc aac ctg ctg ctt ccg gag 1968
Trp Ile Asp Leu Asn Cys Asn Thr Ser Gly Asn Leu Leu Leu Pro Glu
645 650 655

cag gag gcc ctg gag gtc aca cgg gtc ttc ctg aga aag ctc agc cag 2016
Gln Glu Ala Leu Glu Val Thr Arg Val Phe Leu Arg Lys Leu Ser Gln
660 665 670

agg acc cgg ggg aga tac cag ctg cag cgc att gtg aat gtg gag aag 2064
Arg Thr Arg Gly Arg Tyr Gln Leu Gln Arg Ile Val Asn Val Glu Lys
675 680 685

cgc cag gac cgg ctg cgc ggg ggg cgc tac ttc ctg gag ctt gaa ctg 2112
Arg Gln Asp Arg Leu Arg Gly Gly Arg Tyr Phe Leu Glu Leu Glu Leu
690 695 700

ctg gat ggc caa cgc ctg gta cgg ctc tcg gag tac gtg tcc act aga 2160
Leu Asp Gly Gln Arg Leu Val Arg Leu Ser Glu Tyr Val Ser Thr Arg
705 710 715 720

ggc tgg cgg gga ggt gac cac cca ggc agg gag gac aca gaa gct cgg 2208
Gly Trp Arg Gly Gly Asp His Pro Gly Arg Glu Asp Thr Glu Ala Arg
725 730 735

aac ctg cag ggt ctg gtc tgg agc cca cgc aac cgt cac aga cat gtc 2256
Asn Leu Gln Gly Leu Val Trp Ser Pro Arg Asn Arg His Arg His Val
740 745 750

ctg aat gcc cag gat cca gag ccc aag ctc tgc tgg ccc caa ggt ttc 2304
Leu Asn Ala Gln Asp Pro Glu Pro Lys Leu Cys Trp Pro Gln Gly Phe
755 760 765

tcc tgg aac cat cga gct gtg gtc cac ttt att gtg cct gtg aag aac 2352
Ser Trp Asn His Arg Ala Val Val His Phe Ile Val Pro Val Lys Asn
770 775 780

cag gct cgc tgg gtg cag cag ttc atc aga gat atg gag agc ctg tcc 2400
Gln Ala Arg Trp Val Gln Gln Phe Ile Arg Asp Met Glu Ser Leu Ser
785 790 795 800

caa gtc act gga gat gca cat ttc agc atc att atc aca gac tat agc 2448
Gln Val Thr Gly Asp Ala His Phe Ser Ile Ile Ile Thr Asp Tyr Ser
805 810 815

agt gag gac atg gat gtg gag atg gct ctg aag agg tcc aga ctg cgg 2496
Ser Glu Asp Met Asp Val Glu Met Ala Leu Lys Arg Ser Arg Leu Arg
820 825 830

agc tac cag tac ctg aag ctg agt gga aac ttt gag cgc tct gct gga 2544
Ser Tyr Gln Tyr Leu Lys Leu Ser Gly Asn Phe Glu Arg Ser Ala Gly
835 840 845

ctg cag gct ggc ata gac ctg gtg aag gat cca cac agc atc atc ttc 2592
Leu Gln Ala Gly Ile Asp Leu Val Lys Asp Pro His Ser Ile Ile Phe
850 855 860

ctc tgt gac ctg cac atc cac ttt cca gca gga atc att gat acc atc 2640
Leu Cys Asp Leu His Ile His Phe Pro Ala Gly Ile Ile Asp Thr Ile
865 870 875 880

cgg aag cac tgt gtg gag ggc aag atg gcc ttt gcc ccc atg gtg atg 2688
Arg Lys His Cys Val Glu Gly Lys Met Ala Phe Ala Pro Met Val Met
885 890 895

cgg ctg cac tgt ggg gcc acc cca cag tgg cct gag ggc tac tgg gaa 2736
Arg Leu His Cys Gly Ala Thr Pro Gln Trp Pro Glu Gly Tyr Trp Glu
900 905 910

gta aat gga ttt gga ctg ctc ggg atc tac aag tct gac ctg gac aag 2784
Val Asn Gly Phe Gly Leu Leu Gly Ile Tyr Lys Ser Asp Leu Asp Lys
915 920 925

atc gga ggc atg aac acc aag gag ttc aga gac cgc tgg gga ggg gag 2832
Ile Gly Gly Met Asn Thr Lys Glu Phe Arg Asp Arg Trp Gly Gly Glu
930 935 940

gac tgg gag ctg ctg gac agg att ctc caa gca ggc ctg gaa gtg gag 2880
Asp Trp Glu Leu Leu Asp Arg Ile Leu Gln Ala Gly Leu Glu Val Glu
945 950 955 960

cgg ctc tcc ctc agg aac ttc ttc cat cac ttc cat tcc aag cga ggc 2928
Arg Leu Ser Leu Arg Asn Phe Phe His His Phe His Ser Lys Arg Gly
965 970 975

atg tgg aac cgt cgc caa atg aag atg ccg tga 2961
Met Trp Asn Arg Arg Gln Met Lys Met Pro
980 985

<210> 30 
<211> 8 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: FLAG peptide 
<400> 30 

Asp Tyr Lys Asp Asp Asp Asp Lys 
1 5 
<210> 31

<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for amplifying 
mNGalNAc-T1 cDNA

<400> 31 

cccaagcttc gcctgggcta cgggcgagat 30 

<210> 32 

<211> 31 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for amplifying 
mNGalNAc-T1 cDNA

<400> 32 

gctctagact caggatcgct gtgcgcgggc a 31 
<210> 33 
<211> 30

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for amplifying 
mNGalNAc-T2 cDNA 

<400> 33 

cccaagcttc ggcccaggcc ggcgggaacc 30 

<210> 34 

<211> 29 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer used in PCR for amplifying 
mNGalNAc-T1 cDNA

<400> 34 

ggaattctca cggcatcttc atttggcga 29 